Discussion:
Java Strings via SWIG in Android NDK have strange characters in place of null bytes
Mike Davies
2014-10-09 08:33:19 UTC
Hi,

I also posted this question elsewhere but as it most directly relates to
SWIG I hope I can also post it here.

I am using SWIG to generate an interface to some C code which I have
compiled as an Android NDK library. My C code NDK library uses a
structure, MY_STRUCT that contains char* elements that pass data in both
the in and out directions, and the SWIG-generated JNI works fine as far as
that goes, i.e. I can read the strings in the C code, set data and read the
result in the Java code as required.

However, I have a problem: if I pass in a Java String that contains
null bytes, each null is replaced by the two bytes "\300\200".

Eg, if I create a string in java as follows :

MY_STRUCT myStruct = new MY_STRUCT();
byte[] myBytes = new byte[21];
String myString = new String(myBytes);
myStruct.setName(myString);

then myStruct has its name field set to 21 null bytes as required, and this
is visible in the Java debugger, but the string passed across to my NDK
library code, as seen in the NDK C debugger, is as follows :


"\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200"

My SWIG .i file has the following relevant portion :

%include various.i
%include cpointer.i
%pointer_functions(int, intp);

%include ../../../mydir/myheader.h

myheader.h has the following relevant portion :

typedef struct
{
...
char* name;
...
} *P_MY_STRUCT, MY_STRUCT;

The C code debugs fine and I can read and write all the strings pointed to
by the name etc elements of MY_STRUCT, the only issue is the transformation
of null elements of the strings input to the SWIG generated JNI code into
the strange "\300\200" elements in the C code in the NDK library.
From an answer I received elsewhere I understand that this behaviour is due
to the fact that JNI translates Java strings into "Modified UTF-8" strings,
where each null is encoded as the two bytes 0xC0 0x80 (binary 11000000
10000000, which is "\300\200" in octal).

My question is : is there any way around this in SWIG ? I need to be able
to tell whether my passed in string is empty and strlen etc return the
wrong result because of the unusual encoding.
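
One possible workaround on the C side, sketched here only as an illustration:
since Modified UTF-8 encodes an embedded null as the byte pair 0xC0 0x80, the
native code can treat that pair as a terminator, or decode it back to a real
zero byte. The helper names mutf8_cstrlen and mutf8_decode_nuls below are
hypothetical (they are not part of SWIG or JNI), and the sketch assumes the
char* received really is a NUL-terminated Modified UTF-8 buffer that the C
code is allowed to modify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of s up to the first real NUL or the first
 * Modified UTF-8 encoded NUL (0xC0 0x80), whichever comes first.
 * A result of 0 means the string is "empty" in the usual C sense. */
static size_t mutf8_cstrlen(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    while (p[n] != 0) {
        /* p[n] is non-zero here, so p[n + 1] is at worst the terminator. */
        if (p[n] == 0xC0 && p[n + 1] == 0x80)
            break;
        n++;
    }
    return n;
}

/* Hypothetical helper: rewrite every 0xC0 0x80 pair in s as a single
 * zero byte, in place. Returns the number of decoded bytes, not counting
 * the final terminator. The decoded data may contain embedded zeros, so
 * use the returned length rather than strlen() to walk it. */
static size_t mutf8_decode_nuls(char *s)
{
    unsigned char *src = (unsigned char *)s;
    unsigned char *dst = (unsigned char *)s;

    while (*src != 0) {
        if (src[0] == 0xC0 && src[1] == 0x80) {
            *dst++ = 0;      /* encoded NUL becomes a real zero byte */
            src += 2;
        } else {
            *dst++ = *src++; /* every other byte is copied unchanged */
        }
    }
    *dst = 0;
    return (size_t)(dst - (unsigned char *)s);
}
```

With helpers like these, a check such as mutf8_cstrlen(myStruct->name) == 0
would report the 21-null string above as empty, where plain strlen would
return 42.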

ALTERNATIVELY :

I have several functions that take byte arrays instead of strings for
C-function char* arguments and this is achieved in the myModule.i file as
follows :

bool myFunc(P_MY_STRUCT, char* BYTE, int32_t);

Is there any way in SWIG of achieving the equivalent of the BYTE
functionality for structure members ? I tried using the same BYTE trick in
myModule.i as follows but it didn't work :

typedef struct
{
...
char* BYTE;
...
} *P_MY_STRUCT, MY_STRUCT;

Is there another method of using SWIG to pass java byte[] arrays instead of
Strings in the MY_STRUCT fields above ?

Thanks,

Mike
William S Fulton
2014-12-06 22:34:53 UTC
There are some typemaps for this in various.i. Also have a good read of
http://swig.org/Doc3.0/Java.html and look at the Examples/java/typemaps
example.

William
