How to jump a single sample? #6732

Open
1 task done
sxj1215 opened this issue Jan 21, 2025 · 3 comments
Labels
bug (Something isn't working), pending (This problem is yet to be addressed)

Comments

@sxj1215

sxj1215 commented Jan 21, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-5.15.0-124-generic-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • PyTorch version: 2.3.1+cu121 (GPU)
  • Transformers version: 4.45.2
  • Datasets version: 2.20.0
  • Accelerate version: 0.34.0
  • PEFT version: 0.11.1
  • TRL version: 0.9.6
  • GPU type: NVIDIA A100 80GB PCIe
  • DeepSpeed version: 0.14.4

Reproduction

    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2052, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2388, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3479, in training_step
    inputs = self._prepare_inputs(inputs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3429, in _prepare_inputs
    if len(inputs) == 0:
TypeError: object of type 'NoneType' has no len()

Others

I am training the Qwen2-VL model in streaming mode with Hugging Face datasets. I tried the method mentioned in #6233, but it doesn't work. I would like to know how to track down the cause of this error, or how to simply skip the NoneType samples. Many thanks!
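For readers who hit the same error, one way to skip unusable samples is to filter the stream before it is batched. This is a minimal sketch, assuming the bad records show up as `None` or lack a required field. The names `is_usable` and `skip_bad_samples` and the `"messages"` field are hypothetical; they are not part of LLaMA-Factory and are not the method from #6233. Hugging Face `datasets` offers a comparable predicate-based `filter` on streaming datasets.

```python
from typing import Any, Callable, Iterable, Iterator, Optional


def is_usable(sample: Optional[dict]) -> bool:
    """Hypothetical predicate: keep a sample only if it and its required field exist."""
    if sample is None:
        return False
    # "messages" is an assumed field name; adjust it to the real dataset schema.
    return sample.get("messages") is not None


def skip_bad_samples(
    stream: Iterable[Any], keep: Callable[[Any], bool] = is_usable
) -> Iterator[Any]:
    """Lazily yield only the samples for which keep(sample) is true."""
    for sample in stream:
        if keep(sample):
            yield sample


if __name__ == "__main__":
    raw = [{"messages": ["hi"]}, None, {"messages": None}, {"messages": ["bye"]}]
    print(list(skip_bad_samples(raw)))
```

Because the generator is lazy, it never materializes the whole dataset, which keeps it compatible with streaming. Whether the `None` in this issue originates at the sample level or at the batch level is not established in the thread.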

sxj1215 added the bug and pending labels Jan 21, 2025
@hiyouga
Owner

hiyouga commented Jan 22, 2025

We cannot reproduce this issue.

@sxj1215
Author

sxj1215 commented Jan 22, 2025

Many thanks for your reply! I do not encounter this issue with the same datasets when I do not use streaming mode. However, when I use streaming mode, the issue appears. I guess some samples may have only one modality, so I want to skip any sample that is NoneType.

33%|███▎ | 41290/123871 [17:09:07<32:10:16, 1.40s/it]
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/LLaMA-Factory/src/llamafactory/cli.py", line 112, in main
    run_exp()
  File "/LLaMA-Factory/src/llamafactory/train/tuner.py", line 92, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/LLaMA-Factory/src/llamafactory/train/tuner.py", line 66, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 101, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2052, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2388, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3479, in training_step
    inputs = self._prepare_inputs(inputs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3429, in _prepare_inputs
    if len(inputs) == 0:
TypeError: object of type 'NoneType' has no len()
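Since the failure above only surfaced about 17 hours into training, it may be cheaper to scan the stream offline for `None` items before launching a run. The following is a rough sketch under that assumption. The helper `find_none_indices` is hypothetical and is not a LLaMA-Factory or Transformers function. Note also that the progress bar position counts training steps, so it does not directly name the offending record.

```python
from itertools import islice
from typing import Any, Iterable, List, Optional


def find_none_indices(stream: Iterable[Any], limit: Optional[int] = None) -> List[int]:
    """Return positions of None items in `stream`, scanning at most `limit` items.

    With limit=None the whole stream is consumed.
    """
    bad: List[int] = []
    for index, item in enumerate(islice(stream, limit)):
        if item is None:
            bad.append(index)
    return bad


if __name__ == "__main__":
    # Toy stand-in for a streamed dataset with two missing records.
    print(find_none_indices(["a", None, "b", None]))
```

Running the scan on the same iterable the trainer would consume (after preprocessing, before batching) should show whether the `None` values come from the data itself or arise later in the pipeline.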

@hiyouga
Owner

hiyouga commented Jan 22, 2025

Could you try #5346?
