2024-11-21 16:21:33 +08:00

### Install Git LFS

Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system. Once it is installed, initialize it for your user account with the following command:

```bash
git lfs install
```
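
If you script the setup, it can help to confirm that Git LFS is actually reachable before cloning. The following is a minimal sketch, not part of the project; the helper name `git_lfs_available` is hypothetical, and it only checks that `git lfs version` runs successfully.

```python
import shutil
import subprocess


def git_lfs_available() -> bool:
    """Return True if `git` is on PATH and `git lfs version` succeeds."""
    # Hypothetical helper for illustration; not part of PDF-Extract-Kit.
    if shutil.which("git") is None:
        return False
    try:
        # `git lfs version` exits with a non-zero status when the
        # LFS extension is not installed.
        result = subprocess.run(
            ["git", "lfs", "version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Git LFS available:", git_lfs_available())
```

If this prints `False`, install Git LFS with your platform's package manager and run `git lfs install` again.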

### Download the Model from Hugging Face

To download the `PDF-Extract-Kit` model from Hugging Face, use the following command:

```bash
# `git lfs clone` is deprecated; with Git LFS initialized, a plain `git clone` fetches the large files too.
git clone https://huggingface.co/opendatalab/PDF-Extract-Kit
```

Git LFS must be initialized before you clone; otherwise the large model files are checked out as small text pointers instead of the actual weights.
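
To check whether a clone really fetched the large files, you can scan it for leftover Git LFS pointer files. The sketch below relies on the fact that every pointer file starts with the line `version https://git-lfs.github.com/spec/v1`; the function name `find_lfs_pointers` and the directory you pass in are assumptions for illustration.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir):
    """Return files under repo_dir that are still LFS pointers, not real data."""
    # Hypothetical helper for illustration; not part of PDF-Extract-Kit.
    root = Path(repo_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and Git's own metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            pointers.append(path)
    return pointers


if __name__ == "__main__":
    # Assumes the repository was cloned into ./PDF-Extract-Kit.
    for pointer in find_lfs_pointers("PDF-Extract-Kit"):
        print("Still a pointer:", pointer)
```

If any files are reported, running `git lfs pull` inside the cloned repository downloads the real content.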

### Download the Model from ModelScope

#### SDK Download

```bash
# First, install the ModelScope library using pip:
pip install modelscope
```

```python
# Use the following Python code to download the model using the ModelScope SDK:
from modelscope import snapshot_download

model_dir = snapshot_download('opendatalab/PDF-Extract-Kit')
```

#### Git Download

Alternatively, you can use Git to clone the model repository from ModelScope:

```bash
git clone https://www.modelscope.cn/opendatalab/PDF-Extract-Kit.git
```

Arrange the downloaded [model files]() in the following directory structure:

```
./
├── Layout
│   ├── config.json
│   └── model_final.pth
├── MFD
│   └── weights.pt
├── MFR
│   └── UniMERNet
│       ├── config.json
│       ├── preprocessor_config.json
│       ├── pytorch_model.bin
│       ├── README.md
│       ├── tokenizer_config.json
│       └── tokenizer.json
├── TabRec
│   └── StructEqTable
│       ├── config.json
│       ├── generation_config.json
│       ├── model.safetensors
│       ├── preprocessor_config.json
│       ├── special_tokens_map.json
│       ├── spiece.model
│       ├── tokenizer_config.json
│       └── tokenizer.json
└── README.md
```
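
The layout above can be checked with a short script before running anything that loads the weights. This is a sketch only: the file list is copied from the tree, while the function name `missing_model_files` and the `models_root` argument are hypothetical, since the document does not say which directory `./` refers to.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "Layout/config.json",
    "Layout/model_final.pth",
    "MFD/weights.pt",
    "MFR/UniMERNet/config.json",
    "MFR/UniMERNet/preprocessor_config.json",
    "MFR/UniMERNet/pytorch_model.bin",
    "MFR/UniMERNet/README.md",
    "MFR/UniMERNet/tokenizer_config.json",
    "MFR/UniMERNet/tokenizer.json",
    "TabRec/StructEqTable/config.json",
    "TabRec/StructEqTable/generation_config.json",
    "TabRec/StructEqTable/model.safetensors",
    "TabRec/StructEqTable/preprocessor_config.json",
    "TabRec/StructEqTable/special_tokens_map.json",
    "TabRec/StructEqTable/spiece.model",
    "TabRec/StructEqTable/tokenizer_config.json",
    "TabRec/StructEqTable/tokenizer.json",
    "README.md",
]


def missing_model_files(models_root):
    """Return the expected relative paths that are absent under models_root."""
    # Hypothetical helper; models_root is wherever you placed the model files.
    root = Path(models_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected model files are present.")
```

Run it from the directory that holds `Layout`, `MFD`, `MFR`, and `TabRec`, or pass a different path to `missing_model_files`.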