Bobbie-model ★

The research collective has hinted at a 13B version with Mixture of Depths (MoD) later this year. Until then, Bobbie-7B deserves a spot in your evaluation pipeline.

Bobbie loses marginally on standard benchmarks but dramatically outperforms on long-context retrieval (RULER). At 32k context, Bobbie is also 36% faster than Llama-3 due to its BiGLU and windowed attention strategy. 5. How to Use Bobbie-Model The model is available on Hugging Face as bobbie-collective/bobbie-7b-base and bobbie-7b-instruct . Transformers Example from transformers import AutoTokenizer, AutoModelForCausalLM import torch model_name = "bobbie-collective/bobbie-7b-instruct" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModelForCausalLM.from_pretrained( model_name, torch_dtype=torch.bfloat16, device_map="auto" ) bobbie-model

| Stage | Dataset | Tokens | Purpose | |-------|---------|--------|---------| | 1 | RedPajama (v2) | 1.2T | Base language modeling | | 2 | SlimPajama + CodeAlpaca | 400B | Code & reasoning | | 3 | Synthetic multi-turn chat | 50B | Instruction following | The research collective has hinted at a 13B

| Benchmark | Bobbie-7B | Llama-3-8B | Mistral-7B | |-----------|-----------|------------|------------| | MMLU (5-shot) | 64.2 | 66.7 | 63.9 | | GSM8K (8-shot) | 52.8 | 54.9 | 50.3 | | HumanEval (pass@1) | 32.5 | 34.2 | 31.8 | | | 82.3 | 67.1 | 71.4 | | Inference tokens/sec | 98 | 72 | 88 | At 32k context, Bobbie is also 36% faster

If you’ve been following the open-source LLM space, you’ve likely memorized the specs of Llama 3, Mixtral, and Qwen. But a new contender has been quietly gaining traction in the "small model" category: .