Welcome to the official HuggingFace repository for BiMediX, the bilingual medical Large Language Model (LLM) designed for English and Arabic interactions. BiMediX facilitates a broad range of medical interactions, including multi-turn chats, multiple-choice Q&A, and open-ended question answering.
For full details of this model, please read our paper (pre-print) and check our GitHub repository.
Check our preview at 🔗!
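The quick-start below loads the bilingual checkpoint with the Hugging Face transformers library; it assumes a CUDA-capable GPU with enough memory for the checkpoint.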
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BiMediX/BiMediX-Bi"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the weights on the available GPU(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

text = "Hello BiMediX! I've been experiencing increased tiredness in the past week."
# Move the tokenized prompt to the same device as the model
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
The BiMediX model is built on a Mixture of Experts (MoE) architecture and leverages Mixtral-8x7B as its base model. A router network allocates each token to the most relevant experts, each a specialized feedforward block within the model. This design scales the model's capacity while keeping inference sparse: fewer than 13 billion parameters are active per token, which improves efficiency. Training used the BiMed1.3M dataset, which focuses on bilingual medical interactions in English and Arabic and comprises over 632 million healthcare-specialized tokens. Fine-tuning relies on QLoRA, a quantized low-rank adaptation technique that efficiently adapts the model to specific tasks while keeping computational demands manageable.
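To make the routing idea concrete, below is a minimal, self-contained sketch of Mixtral-style top-2 expert routing in PyTorch. The class name, dimensions, and expert count are toy values chosen for illustration, not BiMediX's actual configuration; the point is only to show how a router can activate a sparse subset of expert feedforward blocks per token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy Mixtral-style sparse MoE layer (illustrative, not BiMediX's code).

    A router scores all experts per token, keeps the top-k, and mixes
    their outputs, so only a fraction of the parameters run per token.
    """
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.router(x)                  # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):               # for each of the k routing slots...
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e         # ...find tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

x = torch.randn(10, 64)                          # 10 token embeddings
print(TopKMoE()(x).shape)                        # torch.Size([10, 64])
```

On the fine-tuning side, the following sketch shows how QLoRA-style training could be wired up with the peft and bitsandbytes libraries. The quantization settings, target modules, and LoRA rank are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with 4-bit NF4 quantization (assumed settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",        # base model named above
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; r and target_modules are illustrative choices.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
```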
| Model Name  | Download Link |
|-------------|---------------|
| BiMediX-Bi  | HuggingFace   |
| BiMediX-Ara | HuggingFace   |
| BiMediX-Eng | HuggingFace   |
The BiMediX model was evaluated across several benchmarks, demonstrating its effectiveness in medical language understanding and question answering in both English and Arabic.
Medical Benchmarks Used for Evaluation:
Results and Comparisons:
These results underscore BiMediX's strong capability in handling medical queries and its marked improvement over existing models in both languages, achieved by leveraging its bilingual dataset and training approach.
This release is intended for research and is not ready for clinical or commercial use. Users are urged to employ BiMediX responsibly, especially when applying its outputs in real-world medical scenarios: verify the model's advice with qualified healthcare professionals, and do not rely on AI for medical diagnoses or treatment decisions. Despite the advances BiMediX brings to medical NLP, it shares common challenges with other language models, including hallucinations, toxicity, and stereotypes, and its medical diagnoses and recommendations are not infallible.
BiMediX is released under the CC-BY-NC-SA 4.0 License. For more details, please refer to the LICENSE file included in this repository.
If you use BiMediX in your research, please cite our work as follows:
```bibtex
@misc{pieri2024bimedix,
      title={BiMediX: Bilingual Medical Mixture of Experts LLM},
      author={Sara Pieri and Sahal Shaji Mullappilly and Fahad Shahbaz Khan and Rao Muhammad Anwer and Salman Khan and Timothy Baldwin and Hisham Cholakkal},
      year={2024},
      eprint={2402.13253},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
Visit our GitHub for more information and resources.