bindgen: use bindgen to provide Rust bindings to C - v6 #12451

Closed
wants to merge 9 commits into from

Conversation

jasonish
Member

[Essentially the same as #12450 but with inter-commit builds fixed]

Previous work:

  • bindgen: use bindgen to provide Rust bindings to C - v3 #12062: This first PR used the bindgen library as a build-time plugin. This requires that clang/clang-devel be available to all builders of Suricata. Additionally, there were some path issues on Windows, where Rust and cygpath have very different ideas of what a path should look like, and I was unable to get it to work on Windows. However, it's the most "seamless" way.

  • bindgen-cli: alternate use of bindgen - v1 #12139: This converted to using bindgen-cli to generate the bindings, which lets us control it from Makefiles. There are issues, such as dependency tracking to automatically rebuild the bindings. It also required clang/clang-devel type packages to be installed.

Changes from previous work:

  • This PR builds on the bindgen-cli approach. However, we commit the generated bindings.
  • People who just want to build Suricata don't need to worry about bindgen and related tools.
  • We add a CI check that verifies the bindings are up to date.
  • Make check will also check, assuming bindgen is installed
  • It would be expected that developers have bindgen and related tools installed.

This is just the very beginning of using bindgen effectively, but once foundation work like this is committed, it becomes more about header cleanup, ideally to the point where we have no manually maintained extern "C" code in our Rust.

Ticket: https://redmine.openinfosecfoundation.org/issues/7341

Add a minimal integration of bindgen to the build.

This required some refactoring of the C so app-layer-events.h did not
also include "rust.h", which causes issues for bindgen, probably
related to circular references.

AppLayerEventType was chosen as the first step as it's an argument type
of some app-layer functions that we may want to use bindgen to export
to Rust, and one of the requirements of bindgen might be that C functions
should only use datatypes defined in C, and not Rust. Following such a
rule also prevents circular dependencies between Rust and C code.

As bindgen can only accept one header file, we construct a
pseudo-header file of all the headers that need to be
consumed. Makefile dependency checking is done to make sure the
pseudo-header is only generated as needed to avoid rebuilding every
time.
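
The pseudo-header idea can be sketched in Rust as follows. This is only an illustration: the PR does this work in the Makefile, and the function names `build_wrapper_header` and `needs_update` are hypothetical. The sketch builds one wrapper header from a list of headers and reports whether the file on disk would need rewriting, which approximates the "only generate as needed" behaviour by comparing content instead of using Makefile dependency checking.

```rust
/// Build the text of a wrapper ("pseudo") header that includes every
/// header bindgen should consume. Hypothetical helper, not part of the PR.
pub fn build_wrapper_header(headers: &[&str]) -> String {
    let mut out = String::from("/* Generated wrapper header for bindgen. */\n");
    for header in headers {
        // One #include line per header, in the order given.
        out.push_str(&format!("#include \"{}\"\n", header));
    }
    out
}

/// Return true when the wrapper must be (re)written: either there is no
/// existing wrapper, or its content differs from what we want. Skipping the
/// write when nothing changed keeps the file's timestamp stable, so tools
/// that depend on it are not re-run every time.
pub fn needs_update(existing: Option<&str>, wanted: &str) -> bool {
    existing != Some(wanted)
}
```

Calling `build_wrapper_header(&["app-layer-events.h", "app-layer-protos.h"])` yields a comment line followed by two `#include` lines, and the result could then be passed to bindgen as its single input header.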

Special handling is required for Windows to use the Windows path.

Ticket: OISF#7341

bindgen: pre-generate rust bindings to C

This simplifies building, as we don't have to worry about path and
such under autoconf/automake. It does however mean when we update C
headers that are exposed to Rust, we manually have to re-generate the
bindings, but this is a check we can do in CI.

It also eliminates the need for everyone who wants to build Suricata
to have bindgen and clang tools installed.

This lets us remove decode.h from app-layer-events.h as pulling in
app-layer-events.h shouldn't result in pulling in dpdk, and other
includes not related to app-layer-events.

decode.h also doesn't need those forward declarations anymore due to
previous changes.

Instead of defining this function pointer type in Rust, and having
it in C signatures, create a type and export it to Rust.

To facilitate this, a new header has been created,
"app-layer-types.h". This is to avoid the circular reference of C
headers pulling in "rust.h", which are required to generate Rust
bindings.
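
As a rough sketch of what this looks like from the Rust side: bindgen typically turns a C function pointer typedef into an `Option` around an `unsafe extern "C" fn`. The type names `SCAppLayerEventType` and `SCAppLayerStateGetEventInfoByIdFn` come from this PR, but the parameter list, the enum variants and their values, and the `example_get_event_info_by_id` callback below are assumptions made for illustration, not the generated bindings.

```rust
use std::os::raw::{c_char, c_int};

/// Stand-in for the C enum. Variant names and values are illustrative only.
#[allow(non_camel_case_types)]
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SCAppLayerEventType {
    APP_LAYER_EVENT_TYPE_TRANSACTION = 1,
    APP_LAYER_EVENT_TYPE_PACKET = 2,
}

/// The general shape bindgen emits for a C function pointer typedef.
/// The argument list here is assumed, not copied from the real header.
pub type SCAppLayerStateGetEventInfoByIdFn = Option<
    unsafe extern "C" fn(
        event_id: u8,
        event_name: *mut *const c_char,
        event_type: *mut SCAppLayerEventType,
    ) -> c_int,
>;

/// Hypothetical Rust callback matching the type above. It knows a single
/// event (id 0) and returns -1 for anything else or for null out-pointers.
pub unsafe extern "C" fn example_get_event_info_by_id(
    event_id: u8,
    event_name: *mut *const c_char,
    event_type: *mut SCAppLayerEventType,
) -> c_int {
    if event_id != 0 || event_name.is_null() || event_type.is_null() {
        return -1;
    }
    // SAFETY: both pointers were checked for null above; the caller must
    // still guarantee they are valid for writes.
    unsafe {
        *event_name = b"example_event\0".as_ptr() as *const c_char;
        *event_type = SCAppLayerEventType::APP_LAYER_EVENT_TYPE_TRANSACTION;
    }
    0
}
```

Because the type is defined once in C and imported, a Rust parser can store `Some(example_get_event_info_by_id)` in a field of this type and the compiler checks the signature against the C definition instead of a hand-maintained copy.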
This exposes the C define ALPROTO values to Rust without having to
perform some runtime initialization with init_ffi.

As app-layer-protos.h was clean of a circular reference to rust.h, we
could use it directly; it just needed the addition of
suricata-common.h.
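
To illustrate why this removes the need for runtime initialization, here is a minimal sketch of what a rustified enum for the ALPROTO values might look like. `AppProto` and `AppProtoEnum` are the names allowlisted in this PR; the specific variants, their numeric values, the `u16` width of `AppProto`, and the `is_http1` helper are assumptions for the example.

```rust
/// Illustrative subset of a rustified enum. Values are assumed, not taken
/// from the generated bindings.
#[allow(non_camel_case_types)]
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AppProtoEnum {
    ALPROTO_UNKNOWN = 0,
    ALPROTO_FAILED = 1,
    ALPROTO_HTTP1 = 2,
}

/// Assumed to mirror a 16-bit C typedef.
pub type AppProto = u16;

/// Because the value comes from the bindings, it is a compile-time constant
/// and needs no init_ffi-style setup before first use.
pub const ALPROTO_HTTP1: AppProto = AppProtoEnum::ALPROTO_HTTP1 as AppProto;

/// Hypothetical helper showing the constant in use.
pub fn is_http1(alproto: AppProto) -> bool {
    alproto == ALPROTO_HTTP1
}
```

With constants like this, Rust code can use the values in `const` contexts and `match` arms, which is not possible with values filled in at startup.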
This required us to remove the "auto-generated by" header comment, as it can
change depending on which bindgen version is used.

I suppose different versions could generate slightly different output
that would cause this check to error out. In that case we may want to use a
custom script that can be a bit smarter about the output, or simply error out if
any of the headers used in the bindings is newer than the generated
_sys.rs file.
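
The timestamp-based alternative mentioned above could be sketched like this. It is not something the PR implements; `mtime` and `newer_than` are hypothetical names, and the comparison logic is kept separate from the filesystem access so it is easy to reason about.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Read a file's modification time.
pub fn mtime(path: &Path) -> io::Result<SystemTime> {
    fs::metadata(path)?.modified()
}

/// Given the modification time of the generated bindings and a list of
/// (header name, modification time) pairs, return the names of the headers
/// that are strictly newer than the bindings.
pub fn newer_than<'a>(
    generated: SystemTime,
    headers: &[(&'a str, SystemTime)],
) -> Vec<&'a str> {
    headers
        .iter()
        .filter(|(_, modified)| *modified > generated)
        .map(|(name, _)| *name)
        .collect()
}
```

A check script would call `mtime` on `_sys.rs` and on each input header, then fail if `newer_than` returns a non-empty list. Note that modification times are not preserved by a fresh checkout, so this kind of check is only a heuristic.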
If we have bindgen, and we're building in-tree, rebuild the bindings
and compare to the previous bindings and print a warning if there is a
difference.

Only a warning for now, as I'm not sure where this might fail, and I
don't want it to become a nuisance.
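
A minimal sketch of such a compare-and-warn check, written in Rust for illustration (the PR does this from the build system). It assumes the only version-dependent line is bindgen's `/* automatically generated by rust-bindgen ... */` comment and filters it out before comparing, which is a different technique from removing the header at generation time. The names `strip_generated_header` and `check_bindings` are hypothetical.

```rust
/// Drop bindgen's version-stamped header comment so that two outputs
/// produced by different bindgen versions can be compared.
pub fn strip_generated_header(bindings: &str) -> String {
    bindings
        .lines()
        .filter(|line| {
            !line
                .trim_start()
                .starts_with("/* automatically generated by rust-bindgen")
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Compare committed and freshly generated bindings. Returns a warning
/// message when they differ, or None when they match. Returning a message
/// instead of failing mirrors the "only a warning for now" approach.
pub fn check_bindings(committed: &str, fresh: &str) -> Option<String> {
    if strip_generated_header(committed) == strip_generated_header(fresh) {
        None
    } else {
        Some(String::from(
            "warning: committed bindings differ from freshly generated bindings",
        ))
    }
}
```

A caller would print the returned message and continue the build; turning the warning into an error later only requires changing how the caller reacts to `Some(_)`.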

codecov bot commented Jan 22, 2025

Codecov Report

Attention: Patch coverage is 72.72727% with 9 lines in your changes missing coverage. Please review.

Project coverage is 80.64%. Comparing base (95e8427) to head (2a4b878).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12451      +/-   ##
==========================================
+ Coverage   80.63%   80.64%   +0.01%     
==========================================
  Files         920      921       +1     
  Lines      258704   258723      +19     
==========================================
+ Hits       208595   208646      +51     
+ Misses      50109    50077      -32     
Flag Coverage Δ
fuzzcorpus 56.83% <60.00%> (+0.02%) ⬆️
livemode 19.39% <13.33%> (-0.01%) ⬇️
pcap 44.28% <46.66%> (-0.05%) ⬇️
suricata-verify 63.25% <100.00%> (-0.02%) ⬇️
unittests 58.51% <53.12%> (-0.01%) ⬇️

Comment on lines +18 to +21
// We do this as an include so we can allow non_camel_case_types.
#![allow(non_camel_case_types)]
#![allow(non_snake_case)]
include!("_sys.rs");
Member Author

After talking to @catenacyber I think I'll move the sys stuff to the more idiomatic Rust use of a _sys crate. So suricata_sys will be a crate that only contains C bindings, and not depend on the suricata crate itself.

I guess it would be nice if it could contain the "C" functions written in Rust as well.

Comment on lines +115 to +122
--allowlist-type 'SCAppLayerEventType' \
--rustified-enum 'SCAppLayerEventType' \
--allowlist-type 'SCAppLayerStateGetEventInfoByIdFn' \
--allowlist-type 'AppProto' \
--allowlist-type 'AppProtoEnum' \
--rustified-enum 'AppProtoEnum' \
Member Author

This is restrictive for now, but we could probably expand to SC.*. Then if something else needs to be exposed, like AppProto, it's probably a sign it needs the prefix :)

--allowlist-type 'AppProto' \
--allowlist-type 'AppProtoEnum' \
--rustified-enum 'AppProtoEnum' \
--allowlist-function 'jb_.*' \
Member Author

TODO: Refactor the C JsonBuilder API to follow our C conventions, e.g. SCJbSetString, etc.

@jasonish
Member Author

Previous versions were a draft and somewhat experimental, but I think we need to go this way to be "type safe", so I'm no longer considering this draft or experimental. And of the 3 approaches tried, this is my favorite so far.

@jasonish
Member Author

Previous versions were a draft and somewhat experimental, but I think we need to go this way to be "type safe", so I'm no longer considering this draft or experimental. And of the 3 approaches tried, this is my favorite so far.

Well, once I move the sys module to a suricata_sys crate that is.

@suricata-qa

Information: QA ran without warnings.

Pipeline 24322

@suricata-qa

ERROR:

ERROR: QA failed on build_asan.

Pipeline 24323

This is an example of how we can use cbindgen -> C for "C" style
functions written in Rust, and then back to Rust with bindgen. The idea
is to build one Rust module that contains all the C bindings, whether or
not the functions are written in Rust or C.
@suricata-qa

Information: QA ran without warnings.

Pipeline 24326

Not supported on 0.69.5 as found in many package managers.
@jasonish
Member Author

Replaced by #12461.

@jasonish jasonish closed this Jan 23, 2025