Official codebase for the paper "ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning".
| Webpage | Paper | Dataset(Huggingface), Dataset(ModelScope) |
- Added a local data loader. Users can now load custom queries locally. When a non-default splits_name value (e.g., "abc") is passed to run_exp, the system automatically loads the corresponding file, evaluation/default_splits/abc.txt, which lists the target query filenames.
- Detailed constraint classification. See the Evaluation README for details.
- Introduced an LLM-modulo baseline. Implements the LLM-modulo pipeline with a ground-truth symbolic verifier, based on the methodology of the paper "Robust Planning with Compound LLM Architectures: An LLM-Modulo Approach". Codebase: https://github.com/Atharva-Gundawar/LLM-Modulo-prompts
- Added support for local LLM inference with Qwen3-8B/4B.
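To illustrate the local data loader entry above, here is a minimal sketch of how a split file such as evaluation/default_splits/abc.txt could be read. The function name `read_split_file` and the one-filename-per-line format are assumptions made for illustration; the actual loader in run_exp may differ.

```python
from pathlib import Path
from typing import List


def read_split_file(splits_name: str,
                    splits_dir: str = "evaluation/default_splits") -> List[str]:
    """Return the query filenames listed in <splits_dir>/<splits_name>.txt.

    Hypothetical helper: assumes one query filename per line.
    """
    path = Path(splits_dir) / f"{splits_name}.txt"
    with path.open(encoding="utf-8") as f:
        # Strip whitespace and skip blank lines.
        return [line.strip() for line in f if line.strip()]
```

With a file abc.txt in place, `read_split_file("abc")` would return the listed query filenames in file order.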
- Create a conda environment and install dependencies:
conda create -n chinatravel python=3.9
conda activate chinatravel
pip install -r requirements.txt
- Download the database and unzip it to the chinatravel/environment/ directory
Download Links: Google Drive, NJU Drive
We support deepseek (the official DeepSeek API), gpt-4o (chatgpt-4o-latest), glm4-plus, and local inference with qwen (Qwen2.5-7B-Instruct).
export OPENAI_API_KEY=""
python run_exp.py --splits easy --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits medium --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek --oracle_translation
python run_exp.py --splits human --agent LLMNeSy --llm deepseek
Note: for local inference with qwen, please download the model weights to "project_root_path/chinatravel/open_source_llm/Qwen2.5-7B-Instruct/".
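Before starting a local run, it can help to confirm that the weights are where the note above expects them. The following is a small sketch; `local_weights_dir` and `weights_present` are hypothetical helpers, not part of the codebase, and the project root must be supplied by the caller.

```python
from pathlib import Path


def local_weights_dir(project_root: str,
                      model_name: str = "Qwen2.5-7B-Instruct") -> Path:
    """Build the weights directory path described in the note above."""
    return Path(project_root) / "chinatravel" / "open_source_llm" / model_name


def weights_present(project_root: str,
                    model_name: str = "Qwen2.5-7B-Instruct") -> bool:
    """Return True if the expected weights directory exists."""
    return local_weights_dir(project_root, model_name).is_dir()
```

Calling `weights_present(".")` from the project root returns False if the directory has not been created yet.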
python eval_exp.py --splits human --method LLMNeSy_deepseek_oracletranslation
python eval_exp.py --splits human --method LLMNeSy_deepseek
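The --method values above appear to be built from the agent name, the LLM name, and an optional oracle-translation suffix. The sketch below reproduces that pattern; the rule is inferred only from the two examples shown here, and `method_name` is a hypothetical helper rather than a function from the codebase.

```python
def method_name(agent: str, llm: str, oracle_translation: bool = False) -> str:
    """Compose a --method string following the pattern seen in the examples.

    Assumption: parts are joined with underscores, and runs that used
    --oracle_translation get an "oracletranslation" suffix.
    """
    parts = [agent, llm]
    if oracle_translation:
        parts.append("oracletranslation")
    return "_".join(parts)


print(method_name("LLMNeSy", "deepseek", oracle_translation=True))
# LLMNeSy_deepseek_oracletranslation
print(method_name("LLMNeSy", "deepseek"))
# LLMNeSy_deepseek
```

Other agents or LLMs may use a different naming scheme, so check the results directory produced by run_exp.py before relying on it.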
If you have any problems, please contact Jie-Jing Shao, Bo-Wen Zhang, or Xiao-Wen Yang.
If our paper or related resources prove valuable to your research, we kindly ask that you cite the paper:
@misc{shao2024chinatravelrealworldbenchmarklanguage,
title={ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning},
author={Jie-Jing Shao and Xiao-Wen Yang and Bo-Wen Zhang and Baizhi Chen and Wen-Da Wei and Guohao Cai and Zhenhua Dong and Lan-Zhe Guo and Yu-feng Li},
year={2024},
eprint={2412.13682},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2412.13682},
}