---
license: mit
language:
- en
- zh
---
## Introduction
This is the ShieldLM model ([paper](https://arxiv.org/abs/2402.16444)), initialized from [internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b). ShieldLM is a bilingual (Chinese and English) safety detector designed to help detect safety issues in text generated by LLMs. It aligns with general human safety standards, supports fine-grained customizable detection rules, and provides explanations for its decisions.

Refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for more detailed information.
## Usage
Please refer to our [GitHub repository](https://github.com/thu-coai/ShieldLM) for detailed usage instructions.
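
For orientation only, here is a minimal sketch of how a checkpoint like this one could be loaded and queried with the Hugging Face `transformers` library. Several things in it are assumptions rather than part of this model card: the Hub ID in `MODEL_ID`, the helper names `build_detection_input`, `load_shieldlm`, and `detect`, and above all the input layout, which is a placeholder and not the official ShieldLM prompt template. The real template and inference script live in the GitHub repository and should be used for actual detection.

```python
from typing import Optional

# Assumption: Hub ID used for illustration; point this at wherever the weights are hosted.
MODEL_ID = "thu-coai/ShieldLM-7B-internlm2"


def build_detection_input(query: str, response: str, rules: Optional[str] = None) -> str:
    """Assemble a plain-text input for the detector.

    Hypothetical placeholder layout, NOT the official ShieldLM prompt template.
    """
    parts = [f"Query: {query}", f"Response: {response}"]
    if rules:
        # Optional custom detection rules supplied by the caller.
        parts.append(f"Rules: {rules}")
    return "\n".join(parts)


def load_shieldlm(model_id: str = MODEL_ID):
    """Load tokenizer and model; InternLM2-based checkpoints ship custom code."""
    # Imported lazily so build_detection_input works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, torch_dtype="auto"
    )
    return tokenizer, model


def detect(tokenizer, model, text: str, max_new_tokens: int = 256) -> str:
    """Generate the detector's output and return only the newly generated text."""
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Under these assumptions, a call would look like `tokenizer, model = load_shieldlm()` followed by `detect(tokenizer, model, build_detection_input(query, response))`. Loading downloads the full 7B checkpoint, so a GPU with sufficient memory is advisable.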
## Performance
ShieldLM achieves strong detection performance on four test sets, covering both in-distribution (ID) and out-of-distribution (OOD) data, compared with strong baselines such as GPT-4, Llama Guard, and Perspective API.

Refer to [our paper](https://arxiv.org/abs/2402.16444) for more detailed evaluation results.