BEGIN:VCALENDAR
PRODID:-//github.com/ical-org/ical.net//NONSGML ical.net 4.0//EN
VERSION:2.0
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20241103T020000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20250309T020000
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
DESCRIPTION:This session features two talks:\n\n"A Systematic Framework
  for Scaling High-Fidelity Medical Multimodal Data Curation" by Hyunjae
  Kim\, PhD - Postdoctoral Associate in Biomedical Informatics and Data
  Science\n\nAbstract: Medical multimodal learning is often hindered by
  the scarcity of high-quality image-text data. While scientific
  literature is a vast resource\, extracting clinically relevant\,
  deconstructed\, and aligned image-caption pairs at scale remains
  challenging. We propose MedPMC\, an automated five-stage pipeline to
  curate high-fidelity medical datasets from large-scale biomedical
  repositories. The framework features task-specialized components for
  precise image filtering and sophisticated separation of multi-panel
  figures and captions to ensure accurate image-text alignment. Leveraging
  this pipeline\, we curated 12 million medical image-text pairs. A
  CLIP-style model trained on this dataset surpassed state-of-the-art
  performance across 20+ benchmarks in six clinical specialties\,
  including radiology and pathology. Furthermore\, integrating our model
  into a multimodal LLM outperformed baselines on medical QA tasks by
  3.6%. Crucially\, MedPMC-trained models enhance performance on internal
  clinical data\, underscoring their utility in real-world settings. This
  scalable framework establishes a new paradigm for transforming
  biomedical literature into continuously updatable\, clinically grounded
  training resources.\n\n"S-index – A Refined Data Sharing Index to
  Promote and Reward Biomedical Data Reuse" by Kalpana Raja\, PhD\, MRSB\,
  CSci - Instructor of Biomedical Informatics and Data Science
 \n\nAbstract: Data sharing has become increasingly recognized as
  essential for accelerating scientific discovery\, enhancing
  transparency\, and maximizing the return on research investments.
  Efforts such as the FAIR principles (Findable\, Accessible\,
  Interoperable\, and Reusable) and NIH policies on Data Management and
  Sharing have underscored the importance of making datasets widely
  available to the scientific community. Despite these advances\, current
  practices lack quantitative metrics that accurately reflect researchers'
  contributions to data sharing\, particularly the downstream reuse of
  their datasets by the broader scientific community. Existing citation
  metrics\, such as the H-index\, predominantly measure scholarly impact
  through publications\, neglecting the critical role of dataset creation
  and reuse. Consequently\, there is a pressing need for a novel index to
  quantify and incentivize dataset reuse\, fostering a robust culture of
  open and impactful scientific data sharing. To address this\, we propose
  the Data Sharing index (S-index)\, a refined metric specifically
  designed to quantify a researcher’s contribution to reusable data. We
  have built an end-to-end workflow for S-index computation and a
  web-based interface for visualization and demonstrated feasibility using
  a real-world repository (OpenNeuro).\n\nSpeakers:\nHyunjae Kim\;
  Kalpana Raja\n\nAdmission:\nFree\n\nDetails URL:\n
 https://medicine.yale.edu/event/nlpllm-interest-group-29/\n
DTEND;TZID=America/New_York:20260330T170000
DTSTAMP:20260514T210244Z
DTSTART;TZID=America/New_York:20260330T160000
LOCATION:Join our mailing list to receive Zoom Passcode: https://mailman.y
 ale.edu/mailman/listinfo/nlp-llm-ig\, URL: https://yale.zoom.us/j/9359994
 1969
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:NLP/LLM Interest Group
UID:a4a7536b-4285-49b8-ab48-8d8c49a470a0
END:VEVENT
END:VCALENDAR
