-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed the native crash happening on Android. #116
base: main
Are you sure you want to change the base?
Conversation
* The issue on the Android side was a `null pointer dereference`, meaning that when accessing the getEntryByPath method, the archive object had already been disposed of, which caused the crash. * We are correctly handling and disposing of the previous archive and web views for the previously set archive. However, web views load content on their own thread. When we change the ZIM file, we dispose of the previous archive and stop the ongoing web view processes. In certain cases, the web view internally tries to get content via the getEntryByPath method because it takes some time to cancel all the previous calls on the background thread. As a result, the web view attempts to call the method on a disposed archive, which causes the native crash. * To fix this, I refactored the getPtr method to return nullptr if the archive has already been disposed of. I also handled this in the utils.h class by checking if the object is null (meaning it has already been disposed). If so, an error is thrown (which should be handled by the caller, in this case, the Kiwix app), instead of crashing the program. * Handling this scenario on the Android side is not feasible since the WebView operates on a separate thread over which we have no control. However, addressing this issue on the java-libkiwix side is a better approach. If an object is already disposed, it should throw an error that can be caught by the caller, instead of crashing the entire program. Handling such errors is the responsibility of the user of `java-libkiwix` (e.g., Kiwix Android). We are already handling all errors thrown by `java-libkiwix` in the Kiwix Android app. * Currently, I have created the `SAFE_THIS` macro to gather feedback on this approach. Changing the existing `THIS` macro to incorporate the new behavior would require updating all the places where it is being used. To validate this fix before making those changes, I implemented the new function separately. I explicitly disposed of the archive and attempted to access its content via the getEntryByPath method. Now, instead of crashing the entire program, it correctly throws an error, which is handled by Kiwix.
@kelson42, @mgautierfr Can you please review the PR and validate the idea? then I will refactore the remaining code. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #116 +/- ##
=========================================
Coverage 93.23% 93.23%
Complexity 237 237
=========================================
Files 47 47
Lines 325 325
Branches 3 3
=========================================
Hits 303 303
Misses 19 19
Partials 3 3 ☔ View full report in Codecov by Sentry. |
({ \ | ||
auto ptr = getPtr<NATIVE_TYPE>(env, thisObj); \ | ||
if (!ptr) { \ | ||
throwException(env, "java/lang/IllegalStateException", "The native object has already been disposed."); \ | ||
return nullptr; \ | ||
} \ | ||
ptr; \ | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see how this could work in the pre-C++-14 language standard that I am more or less familiar with. SAFE_THIS
has to be an expression but it doesn't seem to expand to code that can be considered an expression according to the language specification. I tried to quickly skim through https://en.cppreference.com/w/cpp/language/expressions but couldn't find anything that allows to consider this code valid in modern c++.
In other words, I am puzzled how this code compiles (in the context of the change in lib/src/main/cpp/libzim/archive.cpp
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@veloman-yunkan This code is made by @mgautierfr, he can better explain this, and i am not very much familiar with C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that @veloman-yunkan understand how it works :)
#define FOO BAR
make every instance of FOO
literally replaced by BAR
.
So something like THIS->getEntryByPath(...)
is replaced by getPtr<NATIVE_TYPE>(env, thisObj)->getEntryByPath(...)
. And it compiles because getPtr
is a function returning a pointer.
With your define of SAFE_THIS
we have SAFE_THIS->getEntryByPath(...)
replaced by:
({
auto ptr = getPtr<NATIVE_TYPE>(env, thisObj);
if (!ptr) {
throwException(env, "java/lang/IllegalStateException", "The native object has already been disposed.");
return nullptr;
}
ptr;
})->getEntryByPath(...)
And, as @veloman-yunkan, I don't know how this code can be compiled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With your define of SAFE_THIS we have SAFE_THIS->getEntryByPath(...) replaced by:
@mgautierfr SAFE_THIS
also uses the getPtr<NATIVE_TYPE>(env, thisObj)
to get the pointer, and if this method returns the nullptr
it throws the error which will be caught in the archive.cpp
.
METHOD(jobject, getEntryByPath__Ljava_lang_String_2, jstring path) {
return BUILD_WRAPPER("org/kiwix/libzim/Entry", SAFE_THIS->getEntryByPath(TO_C(path)));
} CATCH_EXCEPTION(nullptr)
Also, in the archive.cpp SAFE_THIS
returns the archive.
I think that's why this code was compiled without any error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SAFE_THIS
is not a function. It is a macro (ie. textual replacement)
It appears that we are facing a statement expression (https://stackoverflow.com/questions/8344101/what-is-it-called-when-a-block-returns-a-value) which is not a standard C++ code but a gcc extension.
This explain why it works. However, I not really confident to use a non standard c++ here, especially as we have a better alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh okay got it. @mgautierfr So we can remove this SAFE_THIS
as it is non-standard. In this PR I have updated the getPtr
method if that is null it returns the nullptr
which we can handle before calling the functions on this.
e.g. The updated getPtr
method is:
// Return a shared_ptr for the handle
template<typename T>
shared_ptr<T> getPtr(JNIEnv* env, jobject thisObj, const char* handleName = "nativeHandle")
{
jclass thisClass = env->GetObjectClass(thisObj);
jfieldID fidNumber = env->GetFieldID(thisClass, handleName, "J");
auto handle = reinterpret_cast<shared_ptr<T>*>(env->GetLongField(thisObj, fidNumber));
if (handle == nullptr) {
return nullptr;
}
return *handle;
}
and we are getting this like this in archive.cpp
:
METHOD(jobject, getEntryByPath__Ljava_lang_String_2, jstring path) {
if (THIS == nullptr) {
throwException(env, "java/lang/IllegalStateException", "The Archive object has already been disposed.");
return nullptr;
}
return BUILD_WRAPPER("org/kiwix/libzim/Entry", THIS->getEntryByPath(TO_C(path)));
} CATCH_EXCEPTION(nullptr)
and will update the other code to use this approach. By using this approach we can handle the native crash for now without upgrading the minimum SDK version. If you have a better approach to not writing this check in all places, something like extension function that avoids writing this check everywhere please let me know.
In the screenshot: I have compiled the code with this new approach and tried to explicitly dispose the archive and then tried to access the content on the disposed archive and instead of crashing the program, it normally throws the exception. That we are already handling in Kiwix app.
I am not in favor of upgrading the minimum SDK to 33
because we will lose most of the devices, SDK 33 means Android 13
which is one of the most recent versions of Android.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are correctly handling and disposing of the previous archive and web views for the previously set archive.
I would say that you are not correctly disposing the previous archive if you are disposing it while something is still using it.
However, I understand it can be difficult to do such memory management in a java world where everything is expected to be managed by GC.
I have a clean solution for this problem and it is pr #25. The PR remove the need to call dispose on libzim object (and even remove this method) and make resource automatically clean up when GC destroy the object.
@mgautierfr You are right, and I mentioned this thing in the PR description please see point 4(We have done all possible things to remove loading of webView at the Android level).
Could we complete that PR so we get rid of this error? |
The PR is complete (at least at the time of submission, we may have few merge conflict to fix). |
@mgautierfr I am not in favor of upgrading the minimum SDK version to 33 since it drops the compatibility for most of the devices. Since SDK 33 means Android 13 which is a very new device which is few users using this and the above devices, and most of the users still using the old devices like Android 12, Android 11, etc. Is there any other alternative to use the new PR without upgrading the minimum SDK version? If currently not, what do you think about my this proposal #116 (comment)? |
See issue kiwix/kiwix-android#3956
null pointer dereference
, meaning that when accessing the getEntryByPath method, the archive object had already been disposed of, which caused the crash.java-libkiwix
(e.g., Kiwix Android). We are already handling all errors thrown byjava-libkiwix
in the Kiwix Android app.SAFE_THIS
macro to gather feedback on this approach. Changing the existingTHIS
macro to incorporate the new behavior would require updating all the places where it is being used. To validate this fix before making those changes, I implemented the new function separately. I explicitly disposed of the archive and attempted to access its content via the getEntryByPath method. Now, instead of crashing the entire program, it correctly throws an error, which is handled by Kiwix.The issue sometimes occurs on the CI see https://github.com/kiwix/kiwix-android/actions/runs/12809338048/job/35713951704?pr=4177#step:8:9560