Home Journals RIA Semi-Automatic formalization of a patient/doctor vocabulary for breast cancer

Semi-Automatic formalization of a patient/doctor vocabulary for breast cancer

Université de Montpellier, France

Université Paul Valéry, Montpellier 3, France

Institut Montpelliérain Alexander Grothendieck, France

Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier, France

Institut du Cancer Montpellier, Montpellier, France

Biostatistique et Processus Spatiaux, INRA Avignon, France

Corresponding Author Email:

mike-donald.tapi-nzali@univ-montp2.fr, jerome.aze@lirmm.fr, sandra.bringay@lirmm.fr, christian.lavergne@univ-montp3.fr, caroline.mollevi@icm.unicancer.fr, thomas.optiz@paca.inra.fr

ria30_5_05_nzali.pdf

OPEN ACCESS

Abstract:

Social media is increasingly used by both patients and health professionals. Patients are usually laypeople in the medical field: in their exchanges they use slang, abbreviations, and a vocabulary of their own. Automatically analyzing texts from social networks therefore requires a dedicated vocabulary. Using a corpus of messages collected from social media such as forums and Facebook, we describe the construction of a lexical resource that aligns the vocabulary of patients with that of health professionals. To build this resource and transform it into a SKOS ontology, we combine several methods from the literature that take linguistic and statistical aspects into account. This work will improve information retrieval in health forums and facilitate statistical studies based on information extracted from these forums.

Keywords:

information extraction, social media, statistic-based measure, ontology, patient vocabulary.

1. Introduction

2. Motivations and state of the art

3. Methods

4. Results

5. Formalization of the resource as a SKOS ontology

6. Discussion

7. Conclusion and perspectives

Acknowledgments

This work was funded by the ANR SFIR project (Semantic Indexing of French Biomedical Data Resources) and by the Institut de Recherche en Santé Publique (http://www.iresp.net).
