Lifestyle and Environmental Factors in Social History Sections

Lifestyle and environmental factors play a significant role in both clinical research and clinical care. We created a corpus from the MTSamples website (http://www.mtsamples.com/), a large collection of publicly available transcribed medical records. From this resource, we identified 516 history and physical notes that we expected to contain rich social history information. We then applied our in-house statistical section chunker to these 516 reports and identified 364 sections tagged as social history.

We annotated these sections for tobacco, alcohol, and drug abuse, recording status, method, type, amount, frequency, exposure history, and quit history. In addition, we annotated the following factors: occupation, marital status, family information, residence, living situation, environmental exposure, physical activity, weight management, sexual history, and infectious disease history.

Details of the corpus can be found in the following papers:

M. Yetisgen, L. Vanderwende. Automatic Identification of Substance Abuse from Social History in Clinical Text. To appear in the Proceedings of Artificial Intelligence in Medicine, Vienna, June 2017.

M. Yetisgen, E. Pellicer, D.R. Crosslin, L. Vanderwende. Automatic Identification of Lifestyle and Environmental Factors from Social History in Clinical Text. In Proceedings of AMIA 2016 Joint Summits on Translational Science, San Francisco, 2016.