[WASM] Integer Overflow on 'size' after stat'ing #22924

Open
andrewmd5 opened this issue Jan 16, 2025 · 14 comments

@andrewmd5

andrewmd5 commented Jan 16, 2025

I've come up against a bit of a wall while porting Perl to WASM (using wasi-sdk/wasi-libc). I've gotten everything working, except for an issue with files over 2GB - when stat'ing them, the file size comes back wrong:

FileStat [WASI C# HOST]: /Users/andrew/Downloads/artifact-33/largefile 3221225472
-1073741824

The negative size (-1073741824) suggests we're hitting a 32-bit integer overflow, which is odd since the build is configured with both use64bitint=define and use64bitall=define. I've double-checked that ivsize is set to 8 and lseeksize is 8 in the config.
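
As a quick sanity check on the arithmetic, here is a minimal sketch showing that the reported value is exactly what you get when the low 32 bits of 3221225472 are read as a signed 32-bit integer. The helper name `low32_as_signed` is hypothetical and is not part of Perl or wasi-libc; it only shows that the numbers are consistent with a 32-bit truncation, not where in the stack that truncation happens.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 32 bits of a 64-bit size and
 * interpret them as a two's-complement signed 32-bit value. */
static int32_t low32_as_signed(int64_t size)
{
    uint32_t low = (uint32_t)((uint64_t)size & 0xFFFFFFFFu);

    /* Do the sign conversion by hand to avoid relying on
     * implementation-defined narrowing behavior. */
    if (low >= 0x80000000u)
        return (int32_t)((int64_t)low - 4294967296LL);
    return (int32_t)low;
}
```

Calling `low32_as_signed(3221225472LL)` gives -1073741824, which matches the second line of the output above.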

Before opening this issue I confirmed it wasn't an upstream issue in clang or wasi-libc. I'm attaching my full perl config output below.

I understand this may not be a priority issue since wasm32-wasi isn't an official target yet, but I'd appreciate any guidance on where I can look within Perl's source to potentially patch this, or if there are any specific flags in my configuration hint file I should try.

Full Configuration
Summary of my perl5 (revision 5 version 41 subversion 7) configuration:
   
  Platform:
    osname=wasi
    osvers=wasi25
    archname=wasm32-wasi
    uname='linux fv-az1253-838 6.5.0-1025-azure #26~22.04.1-ubuntu smp thu jul 11 22:33:04 utc 2024 x86_64 x86_64 x86_64 gnulinux '
    config_args='-sde -Dinc_version_list=none -Ddlsrc=none -Dloclibpth= -Dglibpth= -Dlns=/bin/ln -Dman1dir=none -Dman3dir=none -Dosname=wasi -Darchname=wasm32-wasi -Dosvers=wasi25 -Dmyhostname=objex.ai -Dmydomain=objex.ai -Dperladmin=root -Dcc=wasic -Dld=wasic -Dar=/opt/wasi-sdk/bin/llvm-ar -Dranlib=/opt/wasi-sdk/bin/llvm-ranlib -Doptimize=-O2 -Dlibs=-lm -Dhintfile=wasi -Dhostperl=/home/runner/work/test/test/wasm/../native/miniperl -Dhostgenerate=/home/runner/work/test/test/wasm/../native/generate_uudmap -Dprefix=/home/runner/work/test/test/wasm/prefix -Dsysroot=/opt/wasi-sdk/share/wasi-sysroot -Dusedevel -Dstatic_ext=mro Devel/Peek File/DosGlob File/Glob Sys/Syslog Sys/Hostname PerlIO/via PerlIO/mmap PerlIO/encoding B attributes Unicode/Normalize Unicode/Collate threads threads/shared IPC/SysV re Digest/MD5 Digest/SHA SDBM_File Math/BigInt/FastCalc Data/Dumper I18N/Langinfo Time/Piece IO Hash/Util/FieldHash Hash/Util Filter/Util/Call Encode/Unicode Encode Encode/JP Encode/KR Encode/EBCDIC Encode/CN Encode/Symbol Encode/Byte Encode/TW Compress/Raw/Zlib Compress/Raw/Bzip2 MIME/Base64 Cwd Storable List/Util Fcntl Opcode'
    hint=recommended
    useposix=true
    d_sigaction=undef
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='wasic'
    ccflags ='$ccflags -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -DNO_MATHOMS -mllvm -wasm-enable-sjlj -Wno-implicit-function-declaration -D_WASI_EMULATED_PROCESS_CLOCKS -lwasi-emulated-process-clocks -D_WASI_EMULATED_GETPID -lwasi-emulated-getpid -D_GNU_SOURCE -D_POSIX_C_SOURCE -Wno-null-pointer-arithmetic -D_WASI_EMULATED_SIGNAL -lwasi-emulated-signal -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/fcntl.h -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/__header_sys_stat.h -I/home/runner/work/test/test/stubs'
    optimize='-O2'
    cppflags='-lm -Wno-implicit-function-declaration -DNO_MATHOMS -D_WASI_EMULATED_PROCESS_CLOCKS -lwasi-emulated-process-clocks -D_WASI_EMULATED_GETPID -lwasi-emulated-getpid -D_LARGEFILE64_SOURCE -D_GNU_SOURCE -D_POSIX_C_SOURCE -DSTANDARD_C -DPERL_USE_SAFE_PUTENV -D_WASI_EMULATED_SIGNAL -lwasi-emulated-signal -Wno-null-pointer-arithmetic -fno-strict-aliasing -pipe -fstack-protector-strong -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/fcntl.h -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/__header_sys_stat.h -I/home/runner/work/test/test/stubs -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE -DNO_MATHOMS -Wno-implicit-function-declaration -D_WASI_EMULATED_PROCESS_CLOCKS -lwasi-emulated-process-clocks -D_WASI_EMULATED_GETPID -lwasi-emulated-getpid -D_GNU_SOURCE -D_POSIX_C_SOURCE -Wno-null-pointer-arithmetic -D_WASI_EMULATED_SIGNAL -lwasi-emulated-signal -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/fcntl.h -include /opt/wasi-sdk/share/wasi-sysroot/include/wasm32-wasi/__header_sys_stat.h -I/home/runner/work/test/test/stubs'
    ccversion=''
    gccversion='Ubuntu Clang 14.0.0'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=1234
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='ld'
    ldflags ='-static -mllvm -wasm-enable-sjlj -lwasi-emulated-signal -lwasi-emulated-getpid -lwasi-emulated-process-clocks -lwasi-emulated-mman'
    libpth=/opt/wasi-sdk/share/wasi-sysroot/lib
    libs=-lm
    perllibs=-lm
    libc=
    so=so
    useshrplib=false
    libperl=libperl.a
    gnulibc_version='2.35'
  Dynamic Linking:
    dlsrc=dl_none.xs
    dlext=none
    d_dlsymun=undef
    ccdlflags=''
    cccdlflags=''
    lddlflags=''


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    NO_MATHOMS
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USES_PL_PIDSTATUS
    PERL_USE_DEVEL
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under wasi
  Compiled at Jan 16 2025 08:07:40
  @INC:
   /prefix/lib/perl5/5.41.7/wasm32-wasi

While the above configuration says 5.41.7, I've also tested on 5.40.0 and the issue persists.

The only other relevant issue I could find was this one from 1999.

@andrewmd5
Author

I "fixed" this by patching pp_sys.c - maybe there is a way to do it properly via more ceremony and magic invocations with the config and build flags, but this also works and I'm out of time to spend on this:

diff --git a/pp_sys.c b/pp_sys.c
index 5f8bb0d6ed..975e3a3877 100644
--- a/pp_sys.c
+++ b/pp_sys.c
@@ -2271,7 +2271,7 @@ PP_wrapped(pp_syswrite, 0, 1)
         if (MARK >= SP) {
             length = blen;
         } else {
-#if Size_t_size > IVSIZE
+#if Size_t_size > IVSIZE || defined(__wasi__)
             length = (Size_t)SvNVx(*++MARK);
 #else
             length = (Size_t)SvIVx(*++MARK);
@@ -2313,7 +2313,7 @@ PP_wrapped(pp_syswrite, 0, 1)
         goto say_undef;
     SP = ORIGMARK;
 
-#if Size_t_size > IVSIZE
+#if Size_t_size > IVSIZE || defined(__wasi__)
     PUSHn(retval);
 #else
     PUSHi(retval);
@@ -2418,7 +2418,7 @@ PP_wrapped(pp_tell, MAXARG, 0)
         RETURN;
     }
 
-#if LSEEKSIZE > IVSIZE
+#if LSEEKSIZE > IVSIZE || defined(__wasi__)
     PUSHn( (NV)do_tell(gv) );
 #else
     PUSHi( (IV)do_tell(gv) );
@@ -2433,7 +2433,7 @@ PP_wrapped(pp_sysseek, 3, 0)
 {
     dSP;
     const int whence = POPi;
-#if LSEEKSIZE > IVSIZE
+#if LSEEKSIZE > IVSIZE || defined(__wasi__)
     const Off_t offset = (Off_t)SvNVx(POPs);
 #else
     const Off_t offset = (Off_t)SvIVx(POPs);
@@ -2445,7 +2445,7 @@ PP_wrapped(pp_sysseek, 3, 0)
     if (io) {
         const MAGIC * const mg = SvTIED_mg((const SV *)io, PERL_MAGIC_tiedscalar);
         if (mg) {
-#if LSEEKSIZE > IVSIZE
+#if LSEEKSIZE > IVSIZE || defined(__wasi__)
             SV *const offset_sv = newSVnv((NV) offset);
 #else
             SV *const offset_sv = newSViv(offset);
@@ -2464,7 +2464,7 @@ PP_wrapped(pp_sysseek, 3, 0)
             PUSHs(&PL_sv_undef);
         else {
             SV* const sv = sought ?
-#if LSEEKSIZE > IVSIZE
+#if LSEEKSIZE > IVSIZE || defined(__wasi__)
                 newSVnv((NV)sought)
 #else
                 newSViv(sought)
@@ -2486,7 +2486,7 @@ PP_wrapped(pp_truncate, 2, 0)
     /* XXX Configure probe for the length type of *truncate() needed XXX */
     Off_t len;
 
-#if Off_t_size > IVSIZE
+#if Off_t_size > IVSIZE || defined(__wasi__)
     len = (Off_t)POPn;
 #else
     len = (Off_t)POPi;
@@ -3267,7 +3267,7 @@ PP_wrapped(pp_stat, !(PL_op->op_flags & OPf_REF), 0)
 #else
         PUSHs(newSVpvs_flags("", SVs_TEMP));
 #endif
-#if Off_t_size > IVSIZE
+#if Off_t_size > IVSIZE || defined(__wasi__)
         mPUSHn(PL_statcache.st_size);
 #else
         mPUSHi(PL_statcache.st_size);
@@ -3524,7 +3524,7 @@ PP(pp_ftis)
         dTARGET;
         switch (op_type) {
         case OP_FTSIZE:
-#if Off_t_size > IVSIZE
+#if Off_t_size > IVSIZE || defined(__wasi__)
             sv_setnv(TARG, (NV)PL_statcache.st_size);
 #else
             sv_setiv(TARG, (IV)PL_statcache.st_size);

@vadimkantorov

vadimkantorov commented Jan 16, 2025

So one of Size_t_size, LSEEKSIZE, Off_t_size or IVSIZE is probably being determined incorrectly and failing to reflect the 64-bitness of the seek/tell functions (despite -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 in the ccflags from the config hints). It may be worth pasting something like this into pp_sys.c:

#define VALUE_TO_STRING(x) #x
#define VALUE(x) VALUE_TO_STRING(x)
#define VAR_NAME_VALUE(var) #var "="  VALUE(var)
#pragma message(VAR_NAME_VALUE(Size_t_size))
#pragma message(VAR_NAME_VALUE(LSEEKSIZE))
#pragma message(VAR_NAME_VALUE(Off_t_size))
#pragma message(VAR_NAME_VALUE(IVSIZE))

Or maybe it determines these correctly and Size_t_size, LSEEKSIZE, Off_t_size or IVSIZE are all equal to 8 and thus do not fall into the NV branch...

@Leont
Contributor

Leont commented Jan 16, 2025

The negative size (-1073741824) suggests we're hitting a 32-bit integer overflow, which is odd since the build is configured with both use64bitint=define and use64bitall=define. I've double-checked that ivsize is set to 8 and lseeksize is 8 in the config.

That is exactly what I wanted to recommend to you when I had only read the title, so that is weird indeed.

What is the value of d_off64_t / HAS_OFF64_T?

Also, I'm not sure you should set use64bitall, I think you only need use64bitint.

@andrewmd5
Author

andrewmd5 commented Jan 16, 2025

The negative size (-1073741824) suggests we're hitting a 32-bit integer overflow, which is odd since the build is configured with both use64bitint=define and use64bitall=define. I've double-checked that ivsize is set to 8 and lseeksize is 8 in the config.

That is exactly what I wanted to recommend to you when I had only read the title, so that is weird indeed.

What is the value of d_off64_t / HAS_OFF64_T?

Also, I'm not sure you should set use64bitall, I think you only need use64bitint.

I tried with use64bitall set to undef prior to opening the issue (among a few dozen different combinations). This is my current hint file:

	  # fix bizarre preprocessor bug
            
            d_perl_lc_all_category_positions_init='define'
            d_perl_lc_all_separator='undef'
            d_perl_lc_all_uses_name_value_pairs='undef'
            perl_lc_all_category_positions_init='{ 0, 1, 5, 2, 3, 4 }'
            perl_lc_all_separator=''
        
            usemymalloc="n"
            usedevel="y"
            usemultiplicity="undef"
            usenm='undef'
            usemallocwrap="define"
            d_procselfexe='undef'
            d_dlopen='undef'
            d_wait='undef'
            d_waitpid='undef'
            d_wait3='undef'
            d_wait4='undef'
            i_syswait='undef'
      
            i_grp='define'
            i_pwd='define'
            d_getpwnam='undef'
            d_getpwent='undef'
            d_getpwuid='undef' 
            d_getspnam='undef'
            d_getpwnam_r='undef'
            d_getpwent_r='undef'
            d_getpwuid_r='undef'
            d_getprpwnam='undef'
            d_setpwent='undef'
            d_setpwent_r='undef'
            d_getgrnam='undef'
            d_getgrgid='undef'
            d_getgrent='undef'
            d_getgrnam_r='undef'
            d_getgrgid_r='undef'
            d_getgrent_r='undef'
            d_setgrent='undef'
            d_setgrent_r='undef'
            d_endgrent='undef'
            d_endgrent_r='undef'
            d_getuid='undef'
            d_geteuid='undef'
            d_getgid='undef'
            d_getegid='undef'
        
            uselargefiles='define'
            use64bitint='define'
            useperlio='define'
            usequadmath='undef'
            usethreads='undef'
            use64bitall='define'

            d_off64_t='define'
            use_off64_t='define'
            d_stat='define'
            d_fstat='define'
            d_lstat='define'
            d_statblks='undef'
            d_fstat64='define'
            d_fdclose='undef'
            d_dirnamlen='undef'
            d_readdir64_r='define'

            sizesize='8'

            quadtype='long long'
            uquadtype='unsigned long long'
            quadkind='3'


            
            d_setrgid='undef'
            d_setruid='undef'
            d_setproctitle='undef'
            d_malloc_size='undef'
            d_malloc_good_size='undef'
        
            d_clearenv='undef'
            d_cuserid='undef'
            d_eaccess='undef'
            d_getspnam='undef'
            d_msgctl='undef'
            d_msgget='undef'
            d_msgrcv='undef'
            d_msgsnd='undef'
            d_semget='undef'
            d_semop='undef'
            d_shmat='undef'
            d_shmctl='undef'
            d_shmdt='undef'
            d_shmget='undef'
            d_syscall='undef'

            d_killpg='undef'
            d_pause='undef'
            d_wait4='undef'
            d_waitpid='undef'
            d_vfork='undef'
            d_pseudofork='undef'
            i_pthread='undef'
            d_pthread_atfork='undef'
            d_pthread_attr_setscope='undef'
            d_pthread_yield='undef'
            noextensions='Socket POSIX Time/HiRes'

It's entirely possible in the rapid testing of builds I ended up supplying values that conflict with each other. I added some additional logging and this is what is reported when I call stat:

=== File Size Configuration ===
#ifdef HAS_OFF64_T: 1
#ifdef USE_LARGE_FILES: 1
#ifdef NO_64_BIT_RAWIO: 0
#ifdef HAS_FSEEKO: 1
#ifdef USE_64_BIT_RAWIO: 1
FSEEKSIZE: 8
LSEEKSIZE: 8
Off_t_size: 8
IVSIZE: 8
Size_t_size: 8

For sizesize I also tried without this, but it made no difference. I'm assuming user error on my part, but if it seems like there might be a bug I'm happy to provide more information.
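
For reference, with every size logged above equal to 8, none of the `#if X > IVSIZE` guards in pp_sys.c is true, so the IV (signed integer) branch is the one that gets compiled. A minimal sketch of that comparison, using a hypothetical helper name that is not part of Perl's API:

```c
#include <stddef.h>

/* Mirrors the shape of the `#if X > IVSIZE` guards in pp_sys.c:
 * returns 1 when the NV (floating-point) branch would be selected,
 * 0 when the IV (signed integer) branch would be. */
static int selects_nv_branch(size_t type_size, size_t iv_size)
{
    return type_size > iv_size;
}
```

With the logged values, `selects_nv_branch(8, 8)` is 0 for Off_t_size, LSEEKSIZE and Size_t_size alike, which is why adding `|| defined(__wasi__)` to the guards changes the behavior.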

@Leont
Contributor

Leont commented Jan 16, 2025

Just a wild guess, but what happens if you change that mPUSHi(PL_statcache.st_size); in pp_stat to mPUSHu(PL_statcache.st_size); instead of your proposed change?

@andrewmd5
Author

andrewmd5 commented Jan 16, 2025

mPUSHu

That also worked. However, I applied the same idea to seek and tell, since those were problematic too, and they remain broken (my prior patch fixed those as well). Here is the patch from the test I just ran:

diff --git a/pp_sys.c b/pp_sys.c
index 5f8bb0d6ed..6dfb479e04 100644
--- a/pp_sys.c
+++ b/pp_sys.c
@@ -2274,7 +2274,7 @@ PP_wrapped(pp_syswrite, 0, 1)
 #if Size_t_size > IVSIZE
             length = (Size_t)SvNVx(*++MARK);
 #else
-            length = (Size_t)SvIVx(*++MARK);
+            length = (Size_t)SvUVx(*++MARK);
 #endif
             if ((SSize_t)length < 0) {
                 DIE(aTHX_ "Negative length");
@@ -2316,7 +2316,7 @@ PP_wrapped(pp_syswrite, 0, 1)
 #if Size_t_size > IVSIZE
     PUSHn(retval);
 #else
-    PUSHi(retval);
+    PUSHu(retval);
 #endif
     RETURN;
 
@@ -2421,7 +2421,7 @@ PP_wrapped(pp_tell, MAXARG, 0)
 #if LSEEKSIZE > IVSIZE
     PUSHn( (NV)do_tell(gv) );
 #else
-    PUSHi( (IV)do_tell(gv) );
+    PUSHu( (UV)do_tell(gv) );
 #endif
     RETURN;
 }
@@ -2436,7 +2436,7 @@ PP_wrapped(pp_sysseek, 3, 0)
 #if LSEEKSIZE > IVSIZE
     const Off_t offset = (Off_t)SvNVx(POPs);
 #else
-    const Off_t offset = (Off_t)SvIVx(POPs);
+    const Off_t offset = (Off_t)SvUVx(POPs);
 #endif
 
     GV * const gv = PL_last_in_gv = MUTABLE_GV(POPs);
@@ -2448,7 +2448,7 @@ PP_wrapped(pp_sysseek, 3, 0)
 #if LSEEKSIZE > IVSIZE
             SV *const offset_sv = newSVnv((NV) offset);
 #else
-            SV *const offset_sv = newSViv(offset);
+            SV *const offset_sv = newSVuv(offset);
 #endif
 
             return tied_method2(SV_CONST(SEEK), SP, MUTABLE_SV(io), mg, offset_sv,
@@ -2467,7 +2467,7 @@ PP_wrapped(pp_sysseek, 3, 0)
 #if LSEEKSIZE > IVSIZE
                 newSVnv((NV)sought)
 #else
-                newSViv(sought)
+                newSVuv(sought)
 #endif
                 : newSVpvn(zero_but_true, ZBTLEN);
             mPUSHs(sv);
@@ -2489,7 +2489,7 @@ PP_wrapped(pp_truncate, 2, 0)
 #if Off_t_size > IVSIZE
     len = (Off_t)POPn;
 #else
-    len = (Off_t)POPi;
+    len = (Off_t)POPu;
 #endif
     /* Checking for length < 0 is problematic as the type might or
      * might not be signed: if it is not, clever compilers will moan. */
@@ -3270,7 +3270,7 @@ PP_wrapped(pp_stat, !(PL_op->op_flags & OPf_REF), 0)
 #if Off_t_size > IVSIZE
         mPUSHn(PL_statcache.st_size);
 #else
-        mPUSHi(PL_statcache.st_size);
+        mPUSHu(PL_statcache.st_size);
 #endif
 #ifdef BIG_TIME
         mPUSHn(PL_statcache.st_atime);
@@ -3527,7 +3527,7 @@ PP(pp_ftis)
 #if Off_t_size > IVSIZE
             sv_setnv(TARG, (NV)PL_statcache.st_size);
 #else
-            sv_setiv(TARG, (IV)PL_statcache.st_size);
+            sv_setiv(TARG, (UV)PL_statcache.st_size);
 #endif
             break;
         case OP_FTMTIME:

I might have done something incorrect though, so I can look with fresh eyes later.

@Leont
Contributor

Leont commented Jan 16, 2025

This may be this issue. off_t is supposed to be a signed type (per POSIX), but apparently in wasm it's unsigned? Possibly overriding lseektype to a signed 64 bit integer helps (you may have to explicitly cast that PL_statcache.st_size in pp_stat too).

That said, that doesn't explain the 32-bit part of the issue. My best guess is that it's because of sizesize='8', making perl believe size_t is 64 bits instead of 32. You probably shouldn't mess with the detected value.
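
Both guesses can be checked directly against the target toolchain. The following is a rough sketch, not something taken from Perl's Configure: the macro and function names are hypothetical, and the expectation that size_t is 4 bytes on wasm32 is an assumption to verify rather than a measured result.

```c
#include <stddef.h>
#include <sys/types.h>

/* Evaluates to 1 if integer type T is signed, 0 if it is unsigned:
 * casting -1 to an unsigned type yields its maximum value instead. */
#define TYPE_IS_SIGNED(T) (((T)-1) < (T)0)

/* Signedness of the two types under suspicion. */
static int off_t_is_signed(void)  { return TYPE_IS_SIGNED(off_t); }
static int size_t_is_signed(void) { return TYPE_IS_SIGNED(size_t); }

/* Widths in bytes, to compare with lseeksize and sizesize in the config. */
static size_t off_t_bytes(void)  { return sizeof(off_t); }
static size_t size_t_bytes(void) { return sizeof(size_t); }
```

Compiled with the same compiler and flags as the perl build, the results can be compared with what `perl -V` reports. If `size_t_bytes()` returns 4 while sizesize is 8, the hint file is overriding a correctly detectable value.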

@Leont
Contributor

Leont commented Jan 16, 2025

I might have done something incorrect though, so I can look with fresh eyes later.

The return values of pp_syswrite, pp_tell, pp_sysseek and pp_truncate should definitely be signed (because they need to be able to return -1), and that last sv_setiv makes more sense as a sv_setuv. But I suspect that patch is the wrong route anyway.
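
To illustrate why the signedness matters for those return values, here is a small sketch in plain C. It does not use Perl's stack macros, and both function names are hypothetical.

```c
#include <stdint.h>

/* What a caller would see if a signed 64-bit result were pushed
 * through an unsigned path: -1 turns into the largest 64-bit value. */
static uint64_t pushed_as_unsigned(int64_t result)
{
    return (uint64_t)result;
}

/* The usual error check, which only works while the value stays signed. */
static int is_error(int64_t result)
{
    return result < 0;
}
```

`pushed_as_unsigned(-1)` yields UINT64_MAX, so a failed seek or tell would look like an enormous valid offset rather than an error.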

@andrewmd5
Author

Thanks @Leont. I removed sizesize and applied the patch below, and it worked:

diff --git a/pp_sys.c b/pp_sys.c
index 5f8bb0d6ed..ad8e46e692 100644
--- a/pp_sys.c
+++ b/pp_sys.c
@@ -3270,7 +3270,7 @@ PP_wrapped(pp_stat, !(PL_op->op_flags & OPf_REF), 0)
 #if Off_t_size > IVSIZE
         mPUSHn(PL_statcache.st_size);
 #else
-        mPUSHi(PL_statcache.st_size);
+        mPUSHu(PL_statcache.st_size);
 #endif
 #ifdef BIG_TIME
         mPUSHn(PL_statcache.st_atime);
@@ -3527,7 +3527,7 @@ PP(pp_ftis)
 #if Off_t_size > IVSIZE
             sv_setnv(TARG, (NV)PL_statcache.st_size);
 #else
-            sv_setiv(TARG, (IV)PL_statcache.st_size);
+            sv_setuv(TARG, (UV)PL_statcache.st_size);
 #endif
             break;
         case OP_FTMTIME:

@Leont
Contributor

Leont commented Jan 17, 2025

Can you look up the type of struct stat's st_size? Is it signed or unsigned? Is it off_t or something else?

@vadimkantorov

vadimkantorov commented Jan 17, 2025

In sys/stat.h - off_t st_size, so in theory it should be signed 64bit

@Leont
Contributor

Leont commented Jan 17, 2025

In sys/stat.h - off_t st_size, so in theory it should be signed 64bit

The behavior we're observing is strongly suggesting the type of st_size is unsigned, probably even 32 bit unsigned. Why else would things start working when we used unsigned types like UV instead of IV?

We're already detecting the signedness of st_dev and st_ino (e.g. perl -V:st_dev_sign), I'm suspecting the correct solution is to do the same for st_size
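
A probe for st_size could look roughly like the sketch below. This is not the code Configure actually runs. The function name is hypothetical, and the return convention (1 for unsigned, -1 for signed) is chosen to match how I understand st_dev_sign and st_ino_sign to be reported, which is an assumption.

```c
#include <string.h>
#include <sys/stat.h>

/* Store -1 in the field and see whether it stays negative.
 * Returns -1 if st_size is signed, 1 if it is unsigned. */
static int st_size_sign(void)
{
    struct stat st;

    memset(&st, 0, sizeof st);
    st.st_size = -1;
    return (st.st_size < 0) ? -1 : 1;
}
```

Built with the wasi-sdk compiler and the same include flags as the perl build, this would show whether the st_size that perl sees is really a signed type.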

@vadimkantorov

Why else would things start working

I'm also curious about the cause, but st_size is indeed defined as off_t, which is a signed type: https://github.com/WebAssembly/wasi-libc/blob/main/libc-bottom-half/headers/public/__struct_stat.h#L25

@andrewmd5
Author

andrewmd5 commented Jan 17, 2025

In sys/stat.h - off_t st_size, so in theory it should be signed 64bit

The behavior we're observing is strongly suggesting the type of st_size is unsigned, probably even 32 bit unsigned. Why else would things start working when we used unsigned types like UV instead of IV?

We're already detecting the signedness of st_dev and st_ino (e.g. perl -V:st_dev_sign), I'm suspecting the correct solution is to do the same for st_size

st_dev_sign is 1 when I run that, and everything else showed up as unknown. To add to the confusion, in further testing with the previous patch, it still seems to support file sizes exceeding UINT_MAX. From the WASI host I’m definitely writing a 64-bit signed integer to the stat struct, as it’s defined.
