Simplify – Generic Android Deobfuscator

Simplify virtually executes an app to understand its behavior and then tries to optimize the code so that it behaves identically but is easier for a human to understand. Each optimization type is simple and generic, so it doesn’t matter what specific type of obfuscation is used.

  • smalivm: Provides a virtual machine sandbox for executing Dalvik methods. After executing a method, it returns a graph containing all possible register and class values for every execution path. It works even if some values are unknown, such as file and network I/O. For example, any if or switch conditional with an unknown value results in both branches being taken.

  • simplify: Analyzes the execution graphs from smalivm and applies optimizations such as constant propagation, dead code removal, unreflection, and some peephole optimizations. These are fairly simple, but when applied together repeatedly, they’ll decrypt strings, remove reflection, and greatly simplify code. It does not rename methods and classes.
  • demoapp: Contains simple, heavily commented examples for using smalivm in your own project. If you’re building something that needs to execute Dalvik code, check it out.
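
To make the "both branches being taken" behavior concrete, here is a toy sketch in Python. It is not smalivm’s implementation or API; the `UNKNOWN` marker and the `explore` function are hypothetical names used only for illustration.

```python
UNKNOWN = object()  # stand-in for a value the sandbox cannot determine


def explore(condition, then_result, else_result):
    """Return the results of every branch that might execute.

    A known condition selects exactly one branch. An unknown condition
    forces both branches to be explored.
    """
    if condition is UNKNOWN:
        return [then_result, else_result]
    return [then_result] if condition else [else_result]


# A condition that depends on network input is unknown, so both paths are kept.
paths_unknown = explore(UNKNOWN, "then-branch", "else-branch")
# A condition known to be true yields a single path.
paths_known = explore(True, "then-branch", "else-branch")
print(paths_unknown)
print(paths_known)
```

A real executor would carry full register state along each path; the sketch only shows the forking decision.
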
    Usage

    usage: java -jar simplify.jar <input> [options]
    deobfuscates a dalvik executable
    -et,--exclude-types Exclude classes and methods which include REGEX, eg: "com/android", applied after include-types
    -h,--help Display this message
    -ie,--ignore-errors Ignore errors while executing and optimizing methods. This may lead to unexpected behavior.
    --include-support Attempt to execute and optimize classes in Android support library packages, default: false
    -it,--include-types Limit execution to classes and methods which include REGEX, eg: ";->targetMethod("
    --max-address-visits Give up executing a method after visiting the same address N times, limits loops, default: 10000
    --max-call-depth Do not call methods after reaching a call depth of N, limits recursion and long method chains, default: 50
    --max-execution-time Give up executing a method after N seconds, default: 300
    --max-method-visits Give up executing a method after executing N instructions in that method, default: 1000000
    --max-passes Do not run optimizers on a method more than N times, default: 100
    -o,--output Output simplified input to FILE
    --output-api-level Set output DEX API compatibility to LEVEL, default: 15
    -q,--quiet Be quiet
    --remove-weak Remove code even if there are weak side effects, default: true
    -v,--verbose Set verbosity to LEVEL, default: 0

    Building
    Building requires the Java Development Kit 8 (JDK) to be installed.
    Because this project contains submodules for Android frameworks, either clone with --recursive:

    git clone --recursive https://github.com/CalebFenton/simplify.git

    Or update submodules at any time with:

    git submodule update --init --recursive

    Then, to build a single jar which contains all dependencies:

    ./gradlew fatjar

    The Simplify jar will be in simplify/build/libs/. You can test that it’s working by simplifying the provided obfuscated example app. Here’s how you’d run it (you may need to change simplify.jar to match the name of the jar you built):

    java -jar simplify/build/libs/simplify.jar -it 'org/cf/obfuscated' -et 'MainActivity' simplify/obfuscated-app.apk

    To understand what’s getting deobfuscated, check out the Obfuscated App’s README.

    Troubleshooting
    If Simplify fails, try these recommendations, in order:

    1. Only target a few methods or classes by using the -it option.
    2. If the failure is because of maximum visits exceeded, try using higher --max-address-visits, --max-call-depth, and --max-method-visits.
    3. Try with -v or -v 2 and report the issue with the logs and a hash of the DEX or APK.
    4. Try again, but don’t break eye contact. Simplify can sense fear.

    If building on Windows, and the build fails with an error similar to:

    Could not find tools.jar. Please check that C:\Program Files\Java\jre1.8.0_151 contains a valid JDK installation.

    This means Gradle is unable to find a proper JDK path. Make sure the JDK is installed, set the JAVA_HOME environment variable to your JDK path, and make sure to close and re-open the command prompt you use to build.

    Contributing
    Don’t be shy. I think virtual execution and deobfuscation are fascinating problems. Anyone who’s interested is automatically cool and contributions are welcome, even if it’s just to fix a typo. Feel free to ask questions in the issues and submit pull requests.

    Reporting Issues
    Please include a link to the APK or DEX and the full command you’re using. This makes it much easier to reproduce (and thus fix) your issue.
    If you can’t share the sample, please include the file hash (SHA1, SHA256, etc.).
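
If no hashing tool is at hand, a file hash can be computed with a few lines of Python. This is a minimal sketch using the standard hashlib module; the function name `sha256_of_file` is an illustrative choice, not part of Simplify.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large APKs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("sample.apk")` (a hypothetical path) returns the hex digest to paste into the issue.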

    Optimization Strategies

    Constant Propagation
    If an op places a value of a type which can be turned into a constant, such as a string, number, or boolean, this optimization will replace that op with the constant. For example:

    const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
    invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
    # Decrypts to: "Tell me of your homeworld, Usul."
    move-result v0

    In this example, an encrypted string is decrypted and placed into v0. Since strings are “constantizable”, the move-result v0 can be replaced with a const-string:

    const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
    invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
    const-string v0, "Tell me of your homeworld, Usul."
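
The rewrite above can be mimicked on a toy instruction list. The following Python sketch is an illustration only, not how Simplify is implemented: instructions are modeled as tuples, and `fake_decrypt` is a hypothetical stand-in for the decryptor that simply Base64-decodes its argument (the example string happens to be Base64).

```python
import base64


def fake_decrypt(text):
    # Hypothetical stand-in for Lmy/string/Decryptor;->decrypt: plain Base64.
    return base64.b64decode(text).decode("utf-8")


def propagate_constants(instructions):
    """Replace a move-result that follows a resolvable call with a const-string."""
    registers = {}      # register name -> known string value
    last_result = None  # value returned by the most recent call, if known
    optimized = []
    for op, *args in instructions:
        if op == "const-string":
            register, value = args
            registers[register] = value
            optimized.append((op, register, value))
        elif op == "invoke-static":
            (register,) = args
            # "Execute" the call only when its argument is known.
            last_result = fake_decrypt(registers[register]) if register in registers else None
            optimized.append((op, register))
        elif op == "move-result":
            (register,) = args
            if last_result is None:
                registers.pop(register, None)  # value unknown, keep the op
                optimized.append((op, register))
            else:
                registers[register] = last_result
                optimized.append(("const-string", register, last_result))
            last_result = None
        else:
            optimized.append((op, *args))
    return optimized


program = [
    ("const-string", "v0", "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="),
    ("invoke-static", "v0"),
    ("move-result", "v0"),
]
for instruction in propagate_constants(program):
    print(instruction)
```

The call itself is left in place; removing it is the job of dead code removal.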

    Dead Code Removal
    Code is dead if removing it cannot possibly alter the behavior of the app. The most obvious case is unreachable code, e.g. the body of an if (false) block. If code is reachable, it may still be considered dead if it doesn’t affect any state outside of the method, i.e. it has no side effect. For example, code may not affect the return value of the method, alter any class variables, or perform any I/O. This is difficult to determine with static analysis. Luckily, smalivm doesn’t have to be clever. It just stupidly executes everything it can and assumes there are side effects if it can’t be sure. Consider the example from Constant Propagation:

    const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
    invoke-static {v0}, Lmy/string/Decryptor;->decrypt(Ljava/lang/String;)Ljava/lang/String;
    const-string v0, "Tell me of your homeworld, Usul."

    In this code, the invoke-static no longer affects the return value of the method and, assuming it doesn’t do anything weird like write bytes to the file system or a network socket, it has no side effects. It can simply be removed.

    const-string v0, "VGVsbCBtZSBvZiB5b3VyIGhvbWV3b3JsZCwgVXN1bC4="
    const-string v0, "Tell me of your homeworld, Usul."

    Finally, the first const-string assigns a value to a register, but that value is never used, i.e. the assignment is dead. It can also be removed.

    const-string v0, "Tell me of your homeworld, Usul."

    Huzzah!
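
The two removal steps can be reproduced with a small backward scan. This Python sketch is a toy model under stated assumptions, not Simplify’s code: each instruction is a hypothetical tuple of (name, written register, read registers, has side effects), and the side-effect flag is supplied by hand rather than discovered by execution.

```python
def remove_dead_code(instructions, live_out):
    """Drop side-effect-free ops whose written register is never read afterwards.

    `live_out` lists the registers still needed after the last instruction.
    """
    live = set(live_out)
    kept = []
    for name, written, reads, has_side_effects in reversed(instructions):
        if not has_side_effects and written not in live:
            continue  # nothing observes this op, so it is dead
        live.discard(written)  # this op defines the register ...
        live.update(reads)     # ... and needs its inputs to be live
        kept.append((name, written, reads, has_side_effects))
    kept.reverse()
    return kept


program = [
    ("const-string (encrypted)", "v0", (), False),
    ("invoke-static decrypt", None, ("v0",), False),  # result is discarded
    ("const-string (decrypted)", "v0", (), False),
]
for instruction in remove_dead_code(program, live_out={"v0"}):
    print(instruction[0])
```

If the call were flagged as having side effects, it would be kept, and so would the first const-string that feeds it.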

    Unreflection
    One major challenge with static analysis of Java is reflection. It’s just not possible to know what the arguments are for reflection methods without doing careful data flow analysis. There are smart, clever ways of doing this, but smalivm does it by just executing the code. When it finds a reflected method invocation such as:

    invoke-virtual {v0, v1, v2}, Ljava/lang/reflect/Method;->invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;

    It can know the values of v0, v1, and v2. If it’s sure what the values are, it can replace the call to Method.invoke() with an actual non-reflected method invocation. The same applies for reflected field and class lookups.
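
As a loose analogy in Python (not Java reflection, and not Simplify’s code), a lookup by name via getattr plays the role of Method.invoke(). The `Greeter` class and its names are hypothetical.

```python
class Greeter:
    def greet(self, name):
        return "Hello, " + name


target = Greeter()
method_name = "greet"  # in obfuscated code this is often built or decrypted at runtime

# Reflected form: the method is looked up by name and then invoked.
reflected = getattr(target, method_name)("Usul")

# Direct form: once execution shows method_name is always "greet",
# the lookup can be replaced with an ordinary call.
direct = target.greet("Usul")

print(reflected == direct)
```

The direct form is what a reader or a decompiler can follow without tracing where the method name came from.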

    Peephole
    For everything that doesn’t fit cleanly into a particular category, there are peephole optimizations. This includes removing useless check-cast ops, replacing Ljava/lang/String;-><init> calls with const-string, and so on.

    Deobfuscation Example

    Before Optimization

    .method public static test1()I
    .locals 2

    new-instance v0, Ljava/lang/Integer;
    const/4 v1, 0x1
    invoke-direct {v0, v1}, Ljava/lang/Integer;-><init>(I)V

    invoke-virtual {v0}, Ljava/lang/Integer;->intValue()I
    move-result v0

    return v0
    .end method

    All this does is v0 = 1.

    After Constant Propagation

    .method public static test1()I
    .locals 2

    new-instance v0, Ljava/lang/Integer;
    const/4 v1, 0x1
    invoke-direct {v0, v1}, Ljava/lang/Integer;-><init>(I)V

    invoke-virtual {v0}, Ljava/lang/Integer;->intValue()I
    const/4 v0, 0x1

    return v0
    .end method

    The move-result v0 is replaced with const/4 v0, 0x1. This is because there is only one possible return value for intValue()I and the return type can be made a constant. The arguments v0 and v1 are unambiguous and do not change. That is to say, there’s a consensus of values for every possible execution path at intValue()I. Other types of values that can be turned into constants:

    • numbers – const/4, const/16, etc.
    • strings – const-string
    • classes – const-class
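
The "consensus of values" test can be sketched as a small check over the values observed on each execution path. This Python function is an illustration with a hypothetical name and return shape, not part of smalivm.

```python
def consensus(values_per_path):
    """Return (True, value) if every execution path agrees, else (False, None)."""
    values = list(values_per_path)
    if not values:
        return False, None  # no path reached this point
    first = values[0]
    for value in values[1:]:
        # Compare types too, so that 1 and True are not treated as the same constant.
        if type(value) is not type(first) or value != first:
            return False, None
    return True, first


print(consensus([1, 1, 1]))  # every path returns 1, so it can become a constant
print(consensus([1, 2]))     # paths disagree, so the op must stay
```

Only when the check succeeds is it safe to replace the move-result with a constant.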

    After Dead Code Removal

    .method public static test1()I
    .locals 2

    const/4 v0, 0x1

    return v0
    .end method

    Because the code above const/4 v0, 0x1 does not affect state outside of the method (no side effects), it can be removed without changing behavior. If there were a method call that wrote something to the file system or network, it couldn’t be removed because it affects state outside the method. Or if test1()I took a mutable argument, such as a LinkedList, any instructions that accessed it couldn’t be considered dead.
    Other examples of dead code:

    • unreferenced assignments – assigning registers and not using them
    • unreached / unreachable instructions – if (false)


    Published by Marshmallow
