---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- transformers
- Qwen2
license: other
license_name: qodoai-open-rail-m
license_link: LICENSE
pipeline_tag: sentence-similarity
library_name: sentence-transformers
base_model: Alibaba-NLP/gte-Qwen2-1.5B-instruct
---


## Qodo-Embed-1
**Qodo-Embed-1** is a state-of-the-art code embedding model designed for retrieval tasks in the software development domain.
It is offered in two sizes: lite (1.5B) and medium (7B). The model is optimized for natural language-to-code and code-to-code retrieval, which makes it well suited to code search, retrieval-augmented generation (RAG), and contextual understanding of programming languages.
This model outperforms all previous open-source models on the CoIR and MTEB leaderboards, achieving best-in-class performance at a significantly smaller size than competing models.

### Languages Supported: 
* Python
* C++
* C#
* Go
* Java
* JavaScript
* PHP
* Ruby
* TypeScript


## Model Information
- Model Size: 1.5B parameters
- Embedding Dimension: 1536
- Max Input Tokens: 32k
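
As a rough sizing aid, the embedding dimension listed above determines how much raw space a collection of vectors takes. The sketch below assumes embeddings are stored as uncompressed float32 values (4 bytes each) with no index overhead; `index_size_bytes` is a hypothetical helper, not part of any library.

```python
# Back-of-the-envelope storage estimate for Qodo-Embed-1-1.5B embeddings.
# Assumption: vectors are stored as raw float32 (4 bytes per value), with no
# compression and no index metadata.

EMBEDDING_DIM = 1536  # from the model information above
BYTES_PER_FLOAT32 = 4


def index_size_bytes(num_vectors: int, dim: int = EMBEDDING_DIM,
                     bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Return the raw storage needed for num_vectors embeddings."""
    return num_vectors * dim * bytes_per_value


if __name__ == "__main__":
    for n in (1, 100_000, 1_000_000):
        size = index_size_bytes(n)
        print(f"{n:>9} vectors: {size / 1024**2:,.1f} MiB")
```

Under these assumptions a single vector takes 6,144 bytes; a real vector store will typically add overhead of its own on top of this.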

## Requirements
```
transformers>=4.39.2
flash_attn>=2.5.6
```
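
The minimum versions above can be checked before loading the model. The following is a minimal sketch, assuming plain numeric version strings such as `4.39.2`; `parse_version` and `meets_minimum` are hypothetical helpers, and pre-release or local suffixes (for example `rc1` or `+cu121`) are not handled.

```python
# Hypothetical helpers: compare dotted version strings against the minimums
# listed in the requirements above. Only purely numeric versions are supported.

MINIMUM_VERSIONS = {
    "transformers": "4.39.2",
    "flash_attn": "2.5.6",
}


def parse_version(version: str) -> tuple:
    """Turn "4.39.2" into (4, 39, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    print(meets_minimum("4.40.0", MINIMUM_VERSIONS["transformers"]))
    print(meets_minimum("2.5.5", MINIMUM_VERSIONS["flash_attn"]))
```

The installed version string itself could be obtained with `importlib.metadata.version("transformers")` from the standard library.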

## Usage

### Sentence Transformers

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Qodo/Qodo-Embed-1-1.5B")
# Run inference
sentences = [
    'accumulator = sum(item.value for item in collection)',  
    'result = reduce(lambda acc, curr: acc + curr.amount, data, 0)',  
    'matrix = [[i*j for j in range(n)] for i in range(n)]'  
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1536)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
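
To make the similarity step concrete without downloading the model, here is a minimal pure-Python sketch of ranking documents against a query. It assumes the scores are cosine similarities (the Sentence Transformers default unless a model's configuration overrides it) and uses toy 3-dimensional vectors in place of real 1536-dimensional embeddings; `cosine_similarity` and `rank_documents` are illustrative names, not library functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=2):
    """Return (document index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for real embeddings.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]
ranking = rank_documents(query, docs)
print([index for index, _ in ranking])
# [1, 2]
```

Because cosine similarity ignores vector length, the second document scores 1.0 even though it is twice as long as the query.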

### Transformers

```python
import torch
import torch.nn.functional as F

from torch import Tensor
from transformers import AutoTokenizer, AutoModel


def last_token_pool(last_hidden_states: Tensor,
                    attention_mask: Tensor) -> Tensor:
    # With left padding, every sequence ends on a real token,
    # so the final position can be used directly.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        # With right padding, pick the last non-padding position of each sequence.
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]


# Example natural-language queries (embedded as-is here; no task instruction is prepended)
queries = [
    'how to handle memory efficient data streaming',
    'implement binary tree traversal'
]

documents = [
        """def process_in_chunks():
            buffer = deque(maxlen=1000)
            for record in source_iterator:
                buffer.append(transform(record))
                if len(buffer) >= 1000:
                    yield from buffer
                    buffer.clear()""",

        """class LazyLoader:
            def __init__(self, source):
                self.generator = iter(source)
                self._cache = []

            def next_batch(self, size=100):
                while len(self._cache) < size:
                    try:
                        self._cache.append(next(self.generator))
                    except StopIteration:
                        break
                return self._cache.pop(0) if self._cache else None""",

        """def dfs_recursive(root):
            if not root:
                return []
            stack = []
            stack.extend(dfs_recursive(root.right))
            stack.append(root.val)
            stack.extend(dfs_recursive(root.left))
            return stack"""
    ]
input_texts = queries + documents

tokenizer = AutoTokenizer.from_pretrained('Qodo/Qodo-Embed-1-1.5B', trust_remote_code=True)
model = AutoModel.from_pretrained('Qodo/Qodo-Embed-1-1.5B', trust_remote_code=True)

max_length = 8192

# Tokenize the input texts
batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors='pt')
# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**batch_dict)
embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# Normalize embeddings so that the dot product equals cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)
# Score each of the 2 queries against each of the 3 documents, scaled by 100
scores = (embeddings[:2] @ embeddings[2:].T) * 100
print(scores.tolist())
```
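
The pooling step in the example above depends on which side the tokenizer pads. As an illustration only, the sketch below restates the index selection of `last_token_pool` on plain nested lists, so the logic can be checked without PyTorch or the model; `last_token_indices` is a hypothetical helper that returns positions rather than hidden states.

```python
def last_token_indices(attention_mask):
    """Return, per row, the position whose hidden state would be pooled."""
    seq_len = len(attention_mask[0])
    # Left padding: every row ends with a real token (mask value 1).
    left_padding = sum(row[-1] for row in attention_mask) == len(attention_mask)
    if left_padding:
        return [seq_len - 1] * len(attention_mask)
    # Right padding: the last real token sits at (number of real tokens - 1).
    return [sum(row) - 1 for row in attention_mask]


right_padded = [[1, 1, 1, 0], [1, 1, 1, 1]]
left_padded = [[0, 1, 1, 1], [1, 1, 1, 1]]

print(last_token_indices(right_padded))
# [2, 3]
print(last_token_indices(left_padded))
# [3, 3]
```

In both cases the selected position is the last non-padding token of each sequence, which is the hidden state used as the embedding.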


## License
[QodoAI-Open-RAIL-M](https://www.qodo.ai/open-rail-m-license/)