Thank you for the reply. I arranged my data as you described in your last reply, but now, while normalizing the dataset, I get the error shown below.

I am running this command:

```shell
uv run python scripts/fit_embedding_normalizer.py --ds pretraining_data:1 --save_path "my local path" --max_nb_samples 1000000
```
My dataset YAML file:

```yaml
name: "pretraining_data"
parquet_path:
  s3: "wiki_data"
source_column: "text_sentences_sonar_emb"
source_text_column: "text_sentences"
partition_columns:
```
Is this error caused by the Python version? I am using Python 3.10; should I switch to Python 3.11?
```
Traceback (most recent call last):
  File "/home/cpatwadityasharma/large_concept_model/scripts/fit_embedding_normalizer.py", line 101, in <module>
    main(args.ds, args.save_path, args.max_nb_samples)
  File "/home/cpatwadityasharma/large_concept_model/scripts/fit_embedding_normalizer.py", line 79, in main
    embs = sample_sentences_from_mixed_sources(
  File "/home/cpatwadityasharma/large_concept_model/scripts/fit_embedding_normalizer.py", line 52, in sample_sentences_from_mixed_sources
    vecs = pyarrow_fixed_size_array_to_numpy(pc.list_flatten(batch[column]))[
  File "/home/cpatwadityasharma/large_concept_model/.venv/lib/python3.10/site-packages/stopes/utils/arrow_utils.py", line 152, in pyarrow_fixed_size_array_to_numpy
    assert cc.type.list_size is not None
AttributeError: 'pyarrow.lib.ListType' object has no attribute 'list_size'
0it [00:01, ?it/s]
```