arXiv:2207.09379v2 [cs.PL] 29 Jul 2022

To what extent can we analyze Kotlin programs using existing Java taint analysis tools? (Extended Version)

Ranjith Krishnamurthy

Fraunhofer IEM

ranjith.krishnamurth[email protected].de

Goran Piskachev

Fraunhofer IEM

goran.[email protected].de

Eric Bodden

Paderborn University & Fraunhofer IEM

eric.bodden@uni-paderborn.de

Abstract: As an alternative to Java, Kotlin has gained rapid popularity since its introduction and has become the default choice for developing Android apps. However, due to its interoperability with Java, Kotlin programs may contain almost the same security vulnerabilities as their Java counterparts. Hence, we ask: to what extent can one use an existing Java static taint analysis on Kotlin code? In this paper, we investigate the challenges in implementing a taint analysis for Kotlin compared to Java. To answer this question, we performed an exploratory study in which each Kotlin construct was examined and compared to its Java equivalent. We identified 18 engineering challenges that static-analysis writers need to handle differently due to Kotlin's unique constructs or the differences in the bytecode generated by the Kotlin and Java compilers. For eight of them, we provide a conceptual solution, and we implemented six of those solutions as part of SECUCHECK-KOTLIN, an extension to the existing Java taint analysis SECUCHECK.

Index Terms: static analysis, security, Kotlin, taint analysis

I. INTRODUCTION

In the ten years since its introduction, Kotlin has been one of the fastest-growing programming languages (PLs). As of June 2022, it is the twelfth most popular PL by the PYPL index (https://pypl.github.io/PYPL.html). Additionally, over 60% of Android apps are written in Kotlin, earning it the title of the default PL for the Android framework (http://surl.li/cfrcc). Among Kotlin's advantages as a JVM-based PL are its interoperability with Java and its unique constructs, such as data classes, coroutines, null safety, and extensions.

Like Java, Kotlin code may contain security vulnerabilities, such as SQL injection [1]. Therefore, statically analyzing Kotlin code can be a helpful method for detecting bugs and security vulnerabilities as early as possible. Despite Kotlin's popularity, only a few static-analysis tools can analyze Kotlin code, such as KtLint [2], Detekt [3], Diktat [4], and SonarQube [5]. These tools only perform pattern-based analyses using simple rules, such as the rules of SonarQube [6]. We are not aware of any tool that performs deep data-flow analyses on Kotlin code. For example, taint analysis has proven to be very useful for detecting many prevalent security vulnerabilities, such as injections [1], [8], [9] and XSS [10]. This versatility stems from the fact that a taint analysis can be configured with different inputs in the form of rules. At its core, the analysis follows the path between so-called sources, where the taint is created, and so-called sinks, where the taint is reported. The information about the sources and sinks is often encoded in a rule via a domain-specific language (DSL).

For Java, there are many existing taint analyses [11], [12] that can be used to detect many taint-style security vulnerabilities. Since Kotlin compiles to Java bytecode, one can, in theory, use existing Java taint analyses on Kotlin code. However, the Kotlin compiler generates bytecode differently from the Java compiler. This leads to the question: can one use taint analysis tools intended for Java to analyze Kotlin programs, or must one reinvent the wheel?

In this paper, we report the results of an exploratory study that we conducted to address this question. We analyzed the Kotlin-generated bytecode for each language construct and compared it to the Java equivalent. We used the Jimple intermediate representation generated by the Soot framework [13] for this comparison. For completeness, we used the official Kotlin documentation [14] and created a micro benchmark with 294 simple Kotlin programs and 135 simple Java programs, where each program demonstrates a single language construct. When considering taint analysis, we found that most Kotlin constructs can be analyzed the same way as their Java equivalents. However, we also found 18 engineering challenges that require a different approach. For example, functions declared as top-level elements do not have a parent class in the source code. However, the compiler generates a parent class in the Java bytecode, which the taint analysis should be aware of to locate the function correctly. We propose solutions for eight of these challenges that analysis writers can implement. As a proof of concept, we extended an existing Java taint analysis tool, SECUCHECK [12], by implementing six of our eight solutions, creating a taint analysis tool, SECUCHECK-KOTLIN, that supports the standard language constructs. Finally, we evaluated the applicability of SECUCHECK-KOTLIN with the Kotlin version of the PetClinic application (https://github.com/spring-petclinic/spring-petclinic-kotlin).

We present the details of our methodology in Section II. Then, in Section III, we report on our findings from the study. Next, we present details of our implementation of SECUCHECK-KOTLIN in Section IV. Finally, we conclude and present our future work in Section V.

II. METHODOLOGY

We examined the intermediate representation (IR) of the Kotlin code and, where one exists, the equivalent Java code. Our methodology consists of an automatic step, which generates the IR along with metadata useful for our examination, followed by a manual examination step. We examined the following: (1) whether the generated IR for Kotlin is valid and can be analyzed the same way as the IR from equivalent Java code, (2) whether there are difficulties due to the definition of sources and sinks, and (3) whether there are language constructs in Kotlin that the analysis needs to handle in a new, unique way when compared to Java. We did not consider challenges that can occur due to call-graph generation algorithms or algorithms for computing alias information.

We used Kotlin's official documentation [14] to examine each language construct. During the examination, we covered all constructs from the "Concepts" section and a few from the "Standard library" section (Collections, Iterators, Ranges, and Progressions). We did not consider constructs that were in the experimental stage at the time of this study. Table I summarizes the Kotlin constructs discussed in the official documentation and those we manually examined.

Constructs | #Sub-constructs | Supported
Types, Control flow, Packages & imports, Null safety, Equality, This expression, Destructuring declarations, Ranges and Progressions | 11 | ✔
Classes and objects (except for Delegated properties) | 17 | ✔
Functions (except for Builders) | 5 | ✔
Asynchronous programming techniques, Coroutines, Annotations, and Reflections | 4 | ✘
Collections and Iterators | 2 | ✔

Legend
✔: examined all constructs in the category.
✔: examined only the basic constructs.
✘: did not examine in this study.

Table I: List of Kotlin's features discussed in Kotlin's official documentation.

Kotlin targets most Java Development Kit (JDK) versions. However, the annual developer ecosystem survey conducted by JetBrains in 2020 shows that 73% of Kotlin developers target JDK 8 [15]. Furthermore, Kotlin targets JDK 8 by default. Therefore, we consider JDK 8 for this exploratory study. Additionally, we consider Kotlin version 1.5.10. The Kotlin compiler has various options and annotations for modifying the compilation process, which alter the compiler's output, the Java bytecode. For this study, we used the default configuration of the compiler.

For the IR generation, we built a tool, JIMPLEPROVIDER, that generates Jimple IR using the Soot framework [13]. The Jimple code is organized based on the package name. Furthermore, for each class, JIMPLEPROVIDER generates metadata in a JSON file that contains information such as the class name, super class, implemented interfaces, method count, method signatures, local variables, and invoke expressions. This metadata helps to identify the challenges easily and quickly. For a deeper examination, we then inspect the IR and the Java bytecode.

A. Micro benchmark

Using real-world projects for the manual examination is infeasible because a real-world project has a complex mix of many constructs, making it hard to identify them clearly in Jimple. Therefore, we built a micro benchmark suite divided into two groups: the Kotlin suite and the Java suite. The Kotlin suite consists of small Kotlin programs, each focusing on one particular Kotlin construct. If a corresponding feature exists in Java, then an equivalent program is present in the Java suite. The suites contain six main categories: basics (43 Kotlin & 36 Java files), classes and objects (118 Kotlin & 80 Java files), functions (27 Kotlin & 4 Java files), generics (8 Kotlin & 10 Java files), unique to Kotlin (87 Kotlin files), and collection (11 Kotlin & 5 Java files). Table II provides an overview of the Kotlin suite and the important features in the six categories.

Categories in Kotlin suite | Major features | #Kotlin files
basics | data types, control flow, package, import, exceptions, equality, operators, variables | 43
classesAndObjects | classes, enum class, inline class, sealed class, nested / inner class, interface, functional interface (SAM), object expression, object declaration, delegation, qualified this, type aliases, visibility modifiers | 118
functions | simple functions, default arguments, local functions, infix notations, tail recursive function, varargs | 27
generics | simple generic type, generic functions, raw types, upper bounds | 8
uniqueToKotlin | data class, destructuring declaration, extensions, higher-order functions, inline functions, null safety, operator overloading, primary constructor, properties, ranges, progressions, smart cast, string template, declaration site variance, type projection | 87
collection | collection and iterators | 11

Table II: Overview of Kotlin suite.

B. Manual examination

The manual examination of the Jimple code was performed by the first author, who has more than 4.5 years of software development experience and is a Ph.D. student focusing on programming languages and static analysis. The more complex constructs, especially those specific to Kotlin or with differences from Java, were discussed with the second author, a final-year Ph.D. student with expertise in static analysis, and an external researcher with professional experience in Kotlin development. The examiners used JIMPLEPROVIDER to generate the IR for the entire micro benchmark. Then, each construct was inspected manually. First, the generated metadata that provides information related to taint analysis was studied. Next, the generated IR was checked for a deeper examination. If more information was needed, the generated bytecode was examined. Based on this, the examiner concluded whether a construct requires special handling in Kotlin taint analysis compared to Java taint analysis.

C. Threats to Validity

Our study involves a manual step, making it possible that some of the findings are incomplete or incorrect. Furthermore, the programs written in the micro benchmark suite are based on personal experience. Therefore, some advanced use cases may be missing. As discussed earlier in this section, we considered Kotlin version 1.5.10 and the target JDK 8. However, there is a risk that, for some of the constructs, the Kotlin compiler may generate the bytecode differently in other versions. Also, for some constructs, the compiler may generate bytecode differently if certain compiler options are used. As stated earlier, we only used the default configuration.

III. FINDINGS

In Sub-Section III-A, we present the engineering challenges we identified and for which we propose a solution. Then, in Sub-Section III-B, we present the engineering challenges that we leave as open issues. Finally, in Sub-Section III-C, we answer two research questions for the exploratory study.

A. Engineering challenges with proposed solution

1) Data type mapping: On the bytecode level, some data types in Kotlin are mapped to Java data types. For example, the non-nullable kotlin.Int is mapped to Java's int. Table IV summarizes the data type mapping from Kotlin source code to the Java bytecode. Similarly, the compiler maps function types to kotlin.jvm.functions.Function types in the Java bytecode, as described in Table III. This mapping depends only on the number of parameters taken by the function type; the parameter types and the return type do not affect it. Note: the mapping described in Table III is also valid for the respective nullable function types.

Due to this data type mapping, users must provide valid method signatures based on the Java bytecode to specify the source, sink, and other relevant method calls. However, it is cumbersome for users to find the valid method signatures in big projects, which hurts the usability of the tool.

KOTLIN FUNCTION TYPE | TYPE IN JAVA BYTECODE
Function type with 0 parameters, e.g. () -> Int | kotlin.jvm.functions.Function0
Function type with 1 parameter, e.g. (Byte) -> Unit | kotlin.jvm.functions.Function1
Function type with 2 parameters, e.g. (Int, Int) -> Int | kotlin.jvm.functions.Function2
... | ...
Function type with 22 parameters | kotlin.jvm.functions.Function22
Function type with more than 22 parameters | kotlin.jvm.functions.FunctionN

Table III: Kotlin function type mapping.

Proposed solution: To handle this challenge, static-analysis developers can implement a data type transformer, which takes a method signature provided by the users as input. The transformer then checks the parameter types and the return type in the given method signature. If they are valid Kotlin data types, the transformer replaces each Kotlin data type with the respective Java data type.
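As an illustration, such a data type transformer could be sketched in Java as follows. This is a minimal sketch under several assumptions, not the implementation used in SECUCHECK-KOTLIN: the class name KotlinTypeTransformer and its methods are hypothetical, only a small excerpt of Table IV is encoded, signatures are written with "," between parameter types, and function types with receivers or modifiers are not handled.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical data type transformer; all names here are illustrative. */
final class KotlinTypeTransformer {
    // Small excerpt of Table IV for non-nullable types.
    private static final Map<String, String> NON_NULLABLE = Map.of(
            "Unit", "void", "Int", "int", "Long", "long", "Boolean", "boolean",
            "Any", "java.lang.Object", "String", "java.lang.String",
            "IntArray", "int[]", "Array<Int>", "java.lang.Integer[]");

    // Nullable types whose bytecode type differs from the non-nullable one.
    private static final Map<String, String> NULLABLE = Map.of(
            "Unit?", "kotlin.Unit", "Int?", "java.lang.Integer",
            "Long?", "java.lang.Long", "Boolean?", "java.lang.Boolean");

    /** Maps one Kotlin type name to its bytecode-level type name. */
    static String toJvmType(String kotlinType) {
        String t = kotlinType.trim();
        // A nullable function type is written as ((A) -> B)?; unwrap it first.
        boolean wrappedNullable = t.startsWith("(") && t.endsWith("?")
                && matchingParen(t) == t.length() - 2;
        String fn = wrappedNullable ? t.substring(1, t.length() - 2) : t;
        int close = matchingParen(fn);
        if (fn.startsWith("(") && close >= 0
                && fn.substring(close + 1).trim().startsWith("->")) {
            // Table III: only the number of parameters matters.
            int n = countParameters(fn.substring(1, close));
            return "kotlin.jvm.functions.Function" + (n > 22 ? "N" : String.valueOf(n));
        }
        String key = t.startsWith("kotlin.") ? t.substring("kotlin.".length()) : t;
        if (NULLABLE.containsKey(key)) return NULLABLE.get(key);
        boolean nullable = key.endsWith("?");
        String base = nullable ? key.substring(0, key.length() - 1) : key;
        String mapped = NON_NULLABLE.get(base);
        if (mapped != null) return mapped;
        return nullable ? t.substring(0, t.length() - 1) : t; // unknown type: only drop '?'
    }

    /** Builds "<class>: <return type> <name>(<parameter types>)" with mapped types. */
    static String toJvmSignature(String declaringClass, String returnType,
                                 String name, List<String> parameterTypes) {
        String params = parameterTypes.stream()
                .map(KotlinTypeTransformer::toJvmType)
                .collect(Collectors.joining(","));
        return declaringClass + ": " + toJvmType(returnType) + " " + name + "(" + params + ")";
    }

    // Index of the ')' that closes the first '(' in t, or -1 if there is none.
    private static int matchingParen(String t) {
        int depth = 0;
        for (int i = 0; i < t.length(); i++) {
            if (t.charAt(i) == '(') depth++;
            else if (t.charAt(i) == ')' && --depth == 0) return i;
        }
        return -1;
    }

    // Counts top-level, comma-separated parameters; "->" is skipped so that
    // its '>' is not mistaken for a closing angle bracket.
    private static int countParameters(String params) {
        if (params.isBlank()) return 0;
        int depth = 0, count = 1;
        for (int i = 0; i < params.length(); i++) {
            char c = params.charAt(i);
            if (c == '-' && i + 1 < params.length() && params.charAt(i + 1) == '>') { i++; continue; }
            if (c == '(' || c == '<') depth++;
            else if (c == ')' || c == '>') depth--;
            else if (c == ',' && depth == 0) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(toJvmSignature("com.example.Repo", "Unit", "save",
                List.of("String", "Int?", "(Int, Int) -> Int")));
    }
}
```

Types that are not in the lookup tables are passed through unchanged (apart from a trailing ?), so user-defined classes keep their fully qualified names.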

2) Type alias: A type alias allows developers to give a new name to an existing type. For example, in the Kotlin standard library, ArrayList is defined as a type alias to java.util.ArrayList. Therefore, Kotlin's ArrayList does not exist in the bytecode. Experts in the Kotlin programming language know which types are defined as type aliases in the Kotlin standard libraries. Likewise, domain experts in custom libraries such as cryptographic APIs know what type aliases are defined in their libraries. On the other hand, users of the existing Java taint analysis tools may not know such type aliases and may give invalid method signatures.

Proposed solution: Static-analysis developers can implement a feature as part of the DSL that allows domain experts to specify type aliases (type alias specifications). The DSL semantics replaces all the type aliases found in the given method signatures with the original types specified in the given type alias specifications.
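The replacement step described above could look roughly like the following Java sketch. The class TypeAliasSpecification and its methods are hypothetical and only illustrate the idea; the sketch assumes that aliases are registered under the exact name used in signatures and that no method name coincides with an alias name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical type alias specification; not an existing DSL. */
final class TypeAliasSpecification {
    // Matches (qualified) identifiers such as java.util.ArrayList or void.
    private static final Pattern TYPE_TOKEN = Pattern.compile("[A-Za-z_$][\\w$.]*");
    private final Map<String, String> aliases = new HashMap<>();

    /** Registers one alias together with the original type it stands for. */
    TypeAliasSpecification alias(String aliasName, String originalType) {
        aliases.put(aliasName, originalType);
        return this;
    }

    /** Replaces every token that is a known alias with its original type. */
    String resolve(String methodSignature) {
        Matcher m = TYPE_TOKEN.matcher(methodSignature);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String original = aliases.getOrDefault(m.group(), m.group());
            // quoteReplacement keeps '$' in class names from being read as a group reference.
            m.appendReplacement(out, Matcher.quoteReplacement(original));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        TypeAliasSpecification spec = new TypeAliasSpecification()
                .alias("kotlin.collections.ArrayList", "java.util.ArrayList");
        System.out.println(spec.resolve(
                "com.example.Repo: void store(kotlin.collections.ArrayList,int)"));
    }
}
```

Signatures without registered aliases pass through unchanged, so the resolver can be applied to every signature in a specification.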

3) Property: In Kotlin, a property is a field with accessors. By default, Kotlin provides a getter and a setter for mutable properties and only a getter for immutable properties. Whenever a property is accessed in Kotlin source code, the Kotlin compiler uses the respective accessor method in the Java bytecode. Similar to variables, properties can be tainted. Therefore, the getters and setters of properties can be source, sink, or propagator methods. Thus, the user needs to be aware of these signatures.

Proposed solution: Static-analysis developers can provide a feature in the DSL that enables users to specify a property by providing the fully qualified name of the class in which the property is defined, the property name, and the property's type. Then, the valid accessor method signature can be built automatically. The pattern for the getter method is <given fully qualified class name>: <given property's type> get<given property name with first letter caps>(). Similarly, the setter method's pattern is <given fully qualified class name>: void set<given property name with first letter caps>(<given property's type>).
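The two patterns above translate directly into string construction. The following Java sketch uses hypothetical names (PropertyAccessors, getter, setter) and assumes that the property type is already given as a bytecode-level type.

```java
/** Hypothetical helper that builds accessor signatures from a property specification. */
final class PropertyAccessors {
    /** "<class>: <property type> get<PropertyName>()" */
    static String getter(String className, String propertyName, String propertyType) {
        return className + ": " + propertyType + " get" + capitalize(propertyName) + "()";
    }

    /** "<class>: void set<PropertyName>(<property type>)" */
    static String setter(String className, String propertyName, String propertyType) {
        return className + ": void set" + capitalize(propertyName) + "(" + propertyType + ")";
    }

    // Follows only the pattern stated in the text. Kotlin uses a different rule for
    // property names starting with "is" (the getter keeps the property name), which
    // this sketch does not cover.
    private static String capitalize(String name) {
        return Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(getter("com.example.User", "name", "java.lang.String"));
        System.out.println(setter("com.example.User", "name", "java.lang.String"));
    }
}
```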

KOTLIN DATA TYPE | TYPE IN JAVA BYTECODE (a) non-nullable type | TYPE IN JAVA BYTECODE (b) nullable type, written with a trailing ? in Kotlin
SPECIAL RETURN TYPES
Nothing | java.lang.Void | java.lang.Void
Unit | void | kotlin.Unit
BASIC TYPES
Byte | byte | java.lang.Byte
Short | short | java.lang.Short
Int | int | java.lang.Integer
Long | long | java.lang.Long
Char | char | java.lang.Character
Float | float | java.lang.Float
Double | double | java.lang.Double
Boolean | boolean | java.lang.Boolean
FEW BUILT-IN CLASSES
Any | java.lang.Object | java.lang.Object
Cloneable | java.lang.Cloneable | java.lang.Cloneable
Comparable | java.lang.Comparable | java.lang.Comparable
Enum | java.lang.Enum | java.lang.Enum
Annotation | java.lang.annotation.Annotation | java.lang.annotation.Annotation
CharSequence | java.lang.CharSequence | java.lang.CharSequence
String | java.lang.String | java.lang.String
Number | java.lang.Number | java.lang.Number
Throwable | java.lang.Throwable | java.lang.Throwable
ARRAY TYPES
Array<Byte> | java.lang.Byte[] | java.lang.Byte[]
Array<Short> | java.lang.Short[] | java.lang.Short[]
Array<Int> | java.lang.Integer[] | java.lang.Integer[]
Array<Long> | java.lang.Long[] | java.lang.Long[]
Array<Char> | java.lang.Character[] | java.lang.Character[]
Array<Float> | java.lang.Float[] | java.lang.Float[]
Array<Double> | java.lang.Double[] | java.lang.Double[]
Array<Boolean> | java.lang.Boolean[] | java.lang.Boolean[]
Array<Any> | java.lang.Object[] | java.lang.Object[]
Array< | [] | []
BASIC TYPES ARRAY
ByteArray | byte[] | byte[]
ShortArray | short[] | short[]
IntArray | int[] | int[]
LongArray | long[] | long[]
CharArray | char[] | char[]
FloatArray | float[] | float[]
DoubleArray | double[] | double[]
BooleanArray | boolean[] | boolean[]
IMMUTABLE COLLECTIONS
Collection<T> | java.util.Collection<T> | java.util.Collection<T>
List<T> | java.util.List<T> | java.util.List<T>
Set<T> | java.util.Set<T> | java.util.Set<T>
Map<K, V> | java.util.Map<K, V> | java.util.Map<K, V>
Map.Entry<K, V> | java.util.Map.Entry<K, V> | java.util.Map.Entry<K, V>
Iterator<T> | java.util.Iterator<T> | java.util.Iterator<T>
Iterable<T> | java.lang.Iterable<T> | java.lang.Iterable<T>
ListIterator<T> | java.util.ListIterator<T> | java.util.ListIterator<T>
MUTABLE COLLECTIONS
MutableCollection<T> | java.util.Collection<T> | java.util.Collection<T>
MutableList<T> | java.util.List<T> | java.util.List<T>
MutableSet<T> | java.util.Set<T> | java.util.Set<T>
MutableMap<K, V> | java.util.Map<K, V> | java.util.Map<K, V>
MutableMap.Entry<K, V> | java.util.Map.Entry<K, V> | java.util.Map.Entry<K, V>
MutableIterator<T> | java.util.Iterator<T> | java.util.Iterator<T>
MutableIterable<T> | java.lang.Iterable<T> | java.lang.Iterable<T>
MutableListIterator<T> | java.util.ListIterator<T> | java.util.ListIterator<T>

Table IV: Data type mapping from Kotlin source code to the Java bytecode: (a) mapping for non-nullable types, (b) mapping for nullable types.

4) Top-level members: In Kotlin, top-level members are defined in a Kotlin file under a package. Kotlin functions and properties can be top-level members. These members are not declared in any class, object, or interface. Therefore, in Kotlin source code, top-level members can be accessed directly, without creating an object or going through a class. However, the Kotlin compiler generates a class in the Java bytecode and declares those top-level members as static members of the generated class. Suppose a novice user wants to specify top-level members as source, sanitizer, propagator, or sink methods. In that case, the user must identify the valid class name for the method signature of the top-level members.

Proposed solution: To identify the valid class name of top-level members, one needs the file name and the package name in which the top-level members are defined. Therefore, static-analysis developers can provide a feature in the DSL that enables users to specify a function or a property as a top-level member by providing the package name and the file name in which the top-level member is defined. Then, the DSL component can build a valid class name for a top-level function or for the accessors of a top-level property. The rule to build the valid class name is <given package name>.<given file name>Kt.
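The class-name rule for top-level members can be sketched in a few lines of Java. The helper name TopLevelMembers.facadeClassName is hypothetical. The sketch assumes the default compiler configuration, in which the first letter of the file name is capitalized (for example, app.kt in package org.example becomes org.example.AppKt); it does not account for an explicit @file:JvmName annotation or for file names containing characters that are not valid in Java identifiers.

```java
/** Hypothetical helper that derives the generated class name for top-level members. */
final class TopLevelMembers {
    /** Builds "<package name>.<file name>Kt" from a package name and a Kotlin file name. */
    static String facadeClassName(String packageName, String fileName) {
        // Accept the file name with or without the .kt extension.
        String base = fileName.endsWith(".kt")
                ? fileName.substring(0, fileName.length() - 3) : fileName;
        String simpleName = Character.toUpperCase(base.charAt(0)) + base.substring(1) + "Kt";
        // Files in the default package have no package prefix.
        return packageName.isEmpty() ? simpleName : packageName + "." + simpleName;
    }

    public static void main(String[] args) {
        System.out.println(facadeClassName("org.example", "app.kt"));
    }
}
```

The resulting class name can then be used as the declaring class in the method signature of a top-level function or of the accessors of a top-level property.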

5) Default arguments: In Java, default values for function or constructor arguments can only be emulated through overloading, which increases the number of overloads. Kotlin avoids this problem by providing a default argument feature for constructors and functions. For a function or constructor with default arguments, the Kotlin compiler generates two implementations in the Java bytecode. The first is the actual implementation with all the parameters as defined in the source code. The second is a compiler-generated implementation with additional arguments, which determines the values of the default arguments and calls the actual implementation. For a constructor, the compiler adds two additional arguments at the end: int and kotlin.jvm.internal.DefaultConstructorMarker. Similarly, for a function, the compiler adds int and java.lang.Object at the end. Additionally, if the function is a member function, the compiler adds a first argument of the type in which the function is defined. This added first argument is the this-object of the member function. Furthermore, for a top-level function or member function with default arguments, the compiler adds the suffix $default to the function name of the second implementation. If a developer does not pass a value for a default argument, the compiler calls the second implementation. Suppose users of taint analysis tools specify a constructor or function with default arguments as a source method. In that case, the analysis component should identify the second implementation generated by the compiler as a source method and track the variables correctly.

Proposed solution: If the analysis fails to identify a method call as a source, sink, or other relevant method specified in the taint-flow specifications, it checks for the second implementation of the default argument feature. For each function or constructor in the taint-flow specifications, the analysis adds the additional arguments and modifies the function name as described above in Sub-Section III-A5. Subsequently, if the resulting method signature matches the method call's signature, the analysis tracks the respective variables. For constructors and top-level functions, it tracks the variables based on the rules specified for the matched method in the taint-flow specifications. However, for member functions, since the compiler adds a parameter at the beginning, the analysis should account for this added first argument while tracking the variables. For example, if the this-object is specified to be tracked, then the analysis tracks the first argument in the Java bytecode. Likewise, it tracks the second argument in the Java bytecode if the first argument is specified to be tracked, and so forth.
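The signature rewriting and the index shift described above could be sketched as follows. All names (DefaultArgumentSupport, Kind, THIS) are hypothetical, and the convention of encoding the this-object as index -1 is an assumption of this sketch. Constructors are written as void <init>(...), following the JVM naming convention for constructors.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for the compiler-generated second implementation. */
final class DefaultArgumentSupport {
    enum Kind { CONSTRUCTOR, TOP_LEVEL_FUNCTION, MEMBER_FUNCTION }

    /** Index used in this sketch to denote the this-object in a specification. */
    static final int THIS = -1;

    /** Builds the signature of the second implementation for a specified method. */
    static String secondImplementation(Kind kind, String declaringClass, String returnType,
                                       String name, List<String> parameterTypes) {
        List<String> params = new ArrayList<>();
        if (kind == Kind.MEMBER_FUNCTION) {
            params.add(declaringClass); // this-object becomes the first argument
        }
        params.addAll(parameterTypes);
        params.add("int"); // additional int argument added by the compiler
        params.add(kind == Kind.CONSTRUCTOR
                ? "kotlin.jvm.internal.DefaultConstructorMarker" : "java.lang.Object");
        String header = kind == Kind.CONSTRUCTOR
                ? "void <init>" : returnType + " " + name + "$default";
        return declaringClass + ": " + header + "(" + String.join(",", params) + ")";
    }

    /**
     * Maps a parameter index from the specification (THIS or 0-based) to the index in
     * the second implementation. Only member functions shift, because the this-object
     * is inserted at position 0.
     */
    static int bytecodeParameterIndex(Kind kind, int specifiedIndex) {
        return kind == Kind.MEMBER_FUNCTION ? specifiedIndex + 1 : specifiedIndex;
    }

    public static void main(String[] args) {
        System.out.println(secondImplementation(Kind.MEMBER_FUNCTION, "com.example.Db",
                "java.lang.String", "query", List.of("java.lang.String", "int")));
    }
}
```

An analysis would try the original signature first and fall back to the generated one only when no match is found, as described in the proposed solution.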

6) Extensions: In Kotlin, the extension feature allows extending an existing class with new members without using inheritance. However, extensions do not modify an existing class or add a new member to it; instead, the new member is made accessible using the dot-notation on variables of the type (receiver type) for which the extension member is defined. In the Java bytecode for a top-level extension function, the Kotlin compiler adds the receiver type as the first argument, followed by the actual parameters defined in the source code. Similarly, the compiler adds the receiver type as the first argument to the getter method of a top-level extension property. Note: The compiler generates only the getter method for an extension property. Furthermore, for top-level companion object extension members, the compiler also adds the receiver type as the first argument, followed by the actual arguments defined in the source code. However, the type of the added first argument is the wrapper class generated for the companion object. The companion object is discussed in detail in Section III-B1. As for top-level extension members, the compiler also adds the receiver type as the first argument for an extension defined as a class member. Furthermore, Kotlin supports the qualified this-object to access the outer class's this-object. For this, the compiler treats the actual this-object in the Java bytecode as the qualified outer class's this-object in the source code and the first argument in the Java bytecode as the receiver this-object in the source code. Suppose users want to specify an extension member as a source or sink method. Then they might give an invalid method signature, since they might not be aware of the first argument of the receiver type added by the compiler. Furthermore, if users specify to track the this-object in an extension member, then the analysis should track the first argument. Likewise, the analysis should track the actual this-object in the Java bytecode if the outer class's this-object is specified to be tracked. Similarly, if users specify to track the first argument in an extension function, then the analysis should track the second argument, and so forth.

Proposed solution: To handle extension functions and extension properties, static-analysis developers should make their taint-flow specifications aware of them. If this is done through the DSL for taint-flow specifications, the DSL can build the valid method signature by adding the given fully qualified class name of the receiver type as the first argument. Furthermore, users should not be able to obtain a setter method from an extension property, since an extension property cannot have a setter method.

To handle companion object extensions, static-analysis developers can provide a feature in the DSL. This feature enables users to specify a function or property as a companion object extension member by providing the fully qualified class name and the name of the companion object for which the extension is defined. If the name of the companion object is not given, the name defaults to Companion. From these inputs, the generated wrapper class for the companion object can be built as <given fully qualified class name>$<given companion object name>. Then, the valid method signature can be built by adding this wrapper class as the first argument.
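The signature construction for extension members and companion object extensions could be sketched as below. The class ExtensionSignatures and its methods are hypothetical. For a top-level extension, the declaring class passed in is assumed to be the generated file class from Sub-Section III-A4, and all types are assumed to be bytecode-level types already.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds signatures of extension members. */
final class ExtensionSignatures {
    /** The receiver type is prepended to the parameters declared in the source code. */
    static String extensionFunction(String declaringClass, String receiverType,
                                    String returnType, String name,
                                    List<String> parameterTypes) {
        List<String> params = new ArrayList<>();
        params.add(receiverType);
        params.addAll(parameterTypes);
        return declaringClass + ": " + returnType + " " + name
                + "(" + String.join(",", params) + ")";
    }

    /** Only a getter is offered, following the restriction stated in the text. */
    static String extensionPropertyGetter(String declaringClass, String receiverType,
                                          String propertyType, String propertyName) {
        String getterName = "get" + Character.toUpperCase(propertyName.charAt(0))
                + propertyName.substring(1);
        return extensionFunction(declaringClass, receiverType, propertyType,
                getterName, List.of());
    }

    /** "<class>$<companion object name>", with Companion as the default name. */
    static String companionWrapperClass(String className, String companionName) {
        boolean unnamed = companionName == null || companionName.isEmpty();
        return className + "$" + (unnamed ? "Companion" : companionName);
    }

    public static void main(String[] args) {
        // Companion object extension: the wrapper class is the receiver type.
        String receiver = companionWrapperClass("com.example.User", null);
        System.out.println(extensionFunction("com.example.UsersKt", receiver,
                "void", "reset", List.of("int")));
    }
}
```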

To handle the qualified this-object in extensions defined as members, the DSL should be able to track the this-object as the extension receiver or as the dispatch receiver (the outer class's this-object). If users specify to track the this-object as the extension receiver, the taint-flow specification is modified to track the first parameter in the Java bytecode. Similarly, if users specify to track the this-object as the dispatch receiver, the taint-flow specification is modified to track the actual this-object in the Java bytecode. Likewise, for an extension function, if users specify to track the first parameter, then the analysis should track the second parameter, and so forth.

BUILT-IN OPERATOR | MAPPED TO A FUNCTION
UNARY OPERATORS
+obj | obj.unaryPlus()
-obj | obj.unaryMinus()
!obj | obj.not()
++obj | obj.inc()
--obj | obj.dec()
obj++ | obj.inc()
obj-- | obj.dec()
ARITHMETIC OPERATORS
obj1 + obj2 | obj1.plus(obj2)
obj1 - obj2 | obj1.minus(obj2)
obj1 * obj2 | obj1.times(obj2)
obj1 / obj2 | obj1.div(obj2)
obj1 % obj2 | obj1.rem(obj2)
obj1..obj2 | obj1.rangeTo(obj2)
AUGMENTED ASSIGNMENT OPERATORS
obj1 += obj2 | obj1.plusAssign(obj2)
obj1 -= obj2 | obj1.minusAssign(obj2)
obj1 *= obj2 | obj1.timesAssign(obj2)
obj1 /= obj2 | obj1.divAssign(obj2)
obj1 %= obj2 | obj1.remAssign(obj2)
EQUALITY CHECK OPERATOR
obj1 == obj2 | obj1.equals(obj2)
obj1 != obj2 | !(obj1.equals(obj2))
IN OPERATOR
obj1 in obj2 | obj2.contains(obj1)
obj1 !in obj2 | !(obj2.contains(obj1))
INDEX OPERATORS
obj[i] | obj.get(i)
obj[i, j] | obj.get(i, j)
obj[i, j, k] | obj.get(i, j, k)
obj[i1, ..., in] | obj.get(i1, ..., in)
obj[i] = obj2 | obj.set(i, obj2)
obj[i, j] = obj2 | obj.set(i, j, obj2)
obj[i, j, k] = obj2 | obj.set(i, j, k, obj2)
obj[i1, ..., in] = obj2 | obj.set(i1, ..., in, obj2)
INVOKE OPERATORS
obj() | obj.invoke()
obj(i) | obj.invoke(i)
obj(i, j) | obj.invoke(i, j)
obj(i, j, k) | obj.invoke(i, j, k)
obj(i1, i2, ..., in) | obj.invoke(i1, i2, ..., in)
COMPARISON OPERATORS
obj1 > obj2 | obj1.compareTo(obj2) > 0
obj1 < obj2 | obj1.compareTo(obj2) < 0
obj1 >= obj2 | obj1.compareTo(obj2) >= 0
obj1 <= obj2 | obj1.compareTo(obj2) <= 0

Table V: Built-in operators and their corresponding functions in Kotlin.

7) Infix function: In Kotlin, infix functions are called using the infix notation, i.e., without the dot notation and the parentheses. An infix function must be a member function or an extension function and must have a single parameter without a default value. Similar to a standard function, an infix function can be a source, sink, or other relevant method. However, a novice user of taint analysis tools may not know how an infix function is represented in the Java bytecode and may provide invalid method signatures.

Proposed solution: Static-analysis developers can provide a DSL feature that enables users of taint analysis tools to specify a function as an infix function by providing the function name, receiver type, parameter type, and return type. Then, the DSL can build a valid method signature as <given receiver type>: <given return type> <given function name>(<given parameter type>).

8) Operator overloading: Operator overloading redefines the implementation of the built-in operators for specific types. For example, one can overload the ++ operator by defining the function inc on a custom class. The compiler calls the implemented inc function in the Java bytecode. Table V provides the mapping between the built-in operators and the function names. An overloaded operator function can be a sanitizer or propagator method. However, novice users of taint analysis tools may not know the mapping of the built-in operators to the function names and may provide invalid method signatures.

Proposed solution: Static-analysis developers can provide a feature in the DSL that enables users to specify an overloaded operator by providing the symbol of the operator, the type of the receiver, the return type, and the parameter type(s) required by the operator. Then, the DSL can build the valid method signature by mapping the given operator symbol to the function as described in Table V.
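A lookup from operator symbols to function names, followed by signature construction, could be sketched as follows. The class OperatorSignatures and the symbols chosen for the index and invoke operators ("[]", "[]=", "()") are assumptions of this sketch. The number of parameters is used to distinguish unary from binary uses of + and -.

```java
import java.util.Map;

/** Hypothetical helper that maps operator symbols to operator function signatures. */
final class OperatorSignatures {
    // Operators whose function takes no parameters.
    private static final Map<String, String> UNARY = Map.of(
            "+", "unaryPlus", "-", "unaryMinus", "!", "not", "++", "inc", "--", "dec");

    // All remaining operators. For "in" and "!in", the receiver of contains
    // is the right-hand operand.
    private static final Map<String, String> OTHER = Map.ofEntries(
            Map.entry("+", "plus"), Map.entry("-", "minus"), Map.entry("*", "times"),
            Map.entry("/", "div"), Map.entry("%", "rem"), Map.entry("..", "rangeTo"),
            Map.entry("+=", "plusAssign"), Map.entry("-=", "minusAssign"),
            Map.entry("*=", "timesAssign"), Map.entry("/=", "divAssign"),
            Map.entry("%=", "remAssign"), Map.entry("==", "equals"),
            Map.entry("!=", "equals"), Map.entry("in", "contains"),
            Map.entry("!in", "contains"), Map.entry("[]", "get"),
            Map.entry("[]=", "set"), Map.entry("()", "invoke"),
            Map.entry(">", "compareTo"), Map.entry("<", "compareTo"),
            Map.entry(">=", "compareTo"), Map.entry("<=", "compareTo"));

    /** Returns the function name for an operator symbol used with the given parameter count. */
    static String functionName(String symbol, int parameterCount) {
        String name = parameterCount == 0 && UNARY.containsKey(symbol)
                ? UNARY.get(symbol) : OTHER.get(symbol);
        if (name == null) {
            throw new IllegalArgumentException("Unknown operator: " + symbol);
        }
        return name;
    }

    /** Builds "<receiver type>: <return type> <function name>(<parameter types>)". */
    static String signature(String symbol, String receiverType, String returnType,
                            String... parameterTypes) {
        return receiverType + ": " + returnType + " "
                + functionName(symbol, parameterTypes.length)
                + "(" + String.join(",", parameterTypes) + ")";
    }

    public static void main(String[] args) {
        System.out.println(signature("+", "com.example.Money",
                "com.example.Money", "com.example.Money"));
    }
}
```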

B. Engineering challenges without solution (open issues)

1) Companion object: In Kotlin, a companion object binds members to a class rather than to an instance of the class. Kotlin’s companion object is similar to Java’s static members. However, the Kotlin compiler generates a wrapper class for each companion object in the Java bytecode. The naming scheme for that wrapper class is <class name in which the companion object is defined>$<companion object name>. If the companion object name is not provided in the Kotlin source code, the name defaults to Companion. The compiler places the implementation of the companion object’s members in the generated wrapper class.

Furthermore, to allow the wrapper class to access the private members of the actual class and vice versa, the compiler generates additional functions for each private member. For a private function, the naming scheme for the generated function is access$<actual name of the private function>. Similarly, the naming scheme for the accessors of a private property is access$<accessor’s method name of a property>$cp. The accessors’ method names are discussed in Sub-Section III-A3.

Due to this implementation of companion objects in the Java bytecode, users of taint analysis tools might find it difficult to identify valid method signatures. Additionally, for a function that takes a companion object as a parameter, users must give the generated wrapper class as the parameter type in the method signature, even though that class is not visible in the source code. Furthermore, the analysis should be aware of the generated functions for private members, which might be possible sources, sinks, or propagators.
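The naming schemes described for companion objects can be written down as small helper functions. The sketch below only restates those schemes in Java; the class CompanionNamingSketch and its methods are hypothetical helpers, not an existing API.

```java
final class CompanionNamingSketch {

    /** <outer class>$<companion name>, falling back to "Companion" for unnamed companions. */
    static String wrapperClassName(String outerClassName, String companionName) {
        String name = (companionName == null || companionName.isEmpty())
                ? "Companion" : companionName;
        return outerClassName + "$" + name;
    }

    /** Generated function for a private function: access$<function name>. */
    static String privateFunctionBridge(String functionName) {
        return "access$" + functionName;
    }

    /** Generated function for a private property accessor: access$<accessor name>$cp. */
    static String privatePropertyAccessorBridge(String accessorName) {
        return "access$" + accessorName + "$cp";
    }

    public static void main(String[] args) {
        System.out.println(wrapperClassName("com.example.User", null));
        System.out.println(privateFunctionBridge("hashPassword"));
        System.out.println(privatePropertyAccessorBridge("getSecret"));
    }
}
```

A DSL or analysis component could use such helpers to expand one user-specified method into the set of generated names it should also match.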

2) Destructuring declaration: In Kotlin, an object can be destructured into multiple variables in a single statement using a destructuring declaration. To support destructuring, a class must provide componentN functions marked with the operator keyword. These component functions return the properties of a class. By convention, the order of the componentN functions follows the order in which the properties are defined in the class. However, this is not mandatory, and developers can make component functions return any property of a class. Suppose the function component1 returns the first property and the users of taint analysis tools specify the getter method of the first property as a source method. In that case, the analysis should be able to identify the component1 function as a source method. Therefore, the analysis must know the mapping between the componentN functions and the properties of a class to identify a taint-flow in a destructuring declaration.

3) Internal modifier: In Kotlin, a member declared with the internal modifier is visible only inside the module in which it is defined. Kotlin defines a module as a group of Kotlin files that are compiled together. In the Java bytecode, the Kotlin compiler appends the symbol hyphen followed by the module name to the names of the accessors of an internal property and of internal member functions. However, we did not observe this behavior for classes, interfaces, top-level functions, or accessors of top-level properties that are declared as internal. Suppose users of taint analysis tools specify an internal member function or the accessors of an internal property as a sink method. In that case, the analysis component should identify the modified name with the appended module name as a sink method. Otherwise, the analysis component fails to detect taint-flows in internal member functions and properties. Note: if the module name contains a hyphen, the Kotlin compiler replaces it with an underscore before appending the module name to the internal member functions and the accessors of internal properties in the Java bytecode.

4) Inline class: Kotlin’s inline class wraps an existing class with improved performance compared to a manually created wrapper class. In the Java bytecode, the Kotlin compiler generates some member functions for an inline class: the constructor, the accessor for the property (the wrapped class), toString, hashCode, and the equality check. These functions are generated to support interoperability with Java. However, the compiler also generates an alternative version of these functions that improves performance by inlining the wrapped class in place of the wrapper class. The compiler adds the suffix -impl to these improved versions and to the overridden functions of an interface. Additionally, the compiler generates the box-impl and unbox-impl functions for boxing and unboxing the wrapped class. The Kotlin compiler calls the -impl version of member functions wherever possible to improve performance. Suppose users of taint analysis tools specify a member function of an inline class as a source. In that case, along with the actual implementation, the analysis should identify its -impl version as a source. Otherwise, the existing Java taint analysis tools fail to detect taint-flows in an inline class.
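One way an analysis could account for the -impl variants is to expand each user-specified member function name into both candidate names before matching call sites. The following Java sketch shows that expansion only; the class and method names are hypothetical, and it assumes the suffix scheme exactly as described above.

```java
import java.util.List;

final class InlineClassNamingSketch {

    /** Returns the declared name plus its "-impl" variant, both of which may appear in bytecode. */
    static List<String> candidateMethodNames(String memberFunctionName) {
        return List.of(memberFunctionName, memberFunctionName + "-impl");
    }

    public static void main(String[] args) {
        // A user marks "toString" of an inline class as a source; the analysis matches both names.
        System.out.println(candidateMethodNames("toString"));
    }
}
```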

5) Function returning anonymous object: In Kotlin, object expressions create objects of an anonymous class. Every object expression has at least one base class. As in Java, the Kotlin compiler generates a wrapper class for each object expression in the Java bytecode. However, in contrast to Java, specifying the return type of a Kotlin function is not mandatory, and the compiler can infer it. Suppose a function is private and returns an anonymous object. In that case, the compiler infers the return type as the generated wrapper class, which is not visible in the source code. This makes it challenging for users to identify the valid method signature of a private function that returns an anonymous object.

6) Local functions: Kotlin supports local functions, which are functions declared inside other functions. These local functions can access the outer function’s local variables. For a local function, the Kotlin compiler generates a static function in the Java bytecode. The naming scheme and the parameters of this static function are <outer function name>$<local function name><-digits starting from 0 if there are multiple local functions with the same name>(<outer function’s local variables if accessed by the local function>, <this object if the outer function is a member function>, <actual parameters as defined for the local function in Kotlin source code>). Additionally, if a local function accesses a mutable local variable of an outer function, the compiler passes a reference type so that changes are reflected in the outer function. For example, if the local function accesses a mutable variable of type Int, then in the Java bytecode the Kotlin compiler passes the kotlin.jvm.internal.Ref$IntRef type to the generated static function as a parameter. Due to this implementation of local functions in the Java bytecode, it is challenging for users to identify the valid method signature of a local function. Furthermore, the analysis must handle the accessed local variables of the outer function to track tainted variables.
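To make the lowering of local functions more concrete, the following Java sketch is a hand-written analogue of what the generated static function could look like for a local function that mutates a captured variable. It is not compiler output: the nested IntRef class is a stand-in for kotlin.jvm.internal.Ref$IntRef so that the example is self-contained, and the function names total and add are hypothetical.

```java
final class LocalFunctionLoweringSketch {

    /** Stand-in for kotlin.jvm.internal.Ref$IntRef: a mutable box around an int. */
    static final class IntRef {
        int element;
    }

    // Kotlin source being imitated (hypothetical):
    //   fun total(values: IntArray): Int {
    //       var sum = 0
    //       fun add(v: Int) { sum += v }
    //       for (v in values) add(v)
    //       return sum
    //   }

    /** Analogue of the generated static function: captured variable first, then declared parameters. */
    static void total$add(IntRef sum, int v) {
        sum.element += v;
    }

    static int total(int[] values) {
        IntRef sum = new IntRef();   // the mutable local variable is boxed
        sum.element = 0;
        for (int v : values) {
            total$add(sum, v);       // a local call becomes a static call with the box passed along
        }
        return sum.element;
    }

    public static void main(String[] args) {
        System.out.println(total(new int[] {1, 2, 3})); // prints 6
    }
}
```

The sketch shows why a taint analysis has to follow the boxed variable: a tainted value written inside total$add reaches the outer function only through the element field.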

7) Higher-order functions: Kotlin provides function types, which enable higher-order functions. These function types are mapped to the function interfaces in kotlin.jvm.functions (Function0, Function1, and so on) in the Java bytecode, as described in Table III. Furthermore, there are five ways to create an instance of a function type in Kotlin: lambda expression, anonymous function, function literal with a receiver, callable reference, and instances of a custom class that implements a function type. The Kotlin compiler generates a wrapper class for each instance of a function type in Kotlin source code. The naming scheme for this wrapper class is <class name in which the lambda is declared>$<function name in which the lambda is declared>$<variable name in which the lambda expression is stored, if any; otherwise this part is omitted>$<digits starting from 1>. This wrapper class overrides the interface function invoke, in which the Kotlin compiler places the implementation of the lambda expression.

Similar to local functions accessing the outer function’s local variables, as discussed in Sub-Section III-B6, lambda expressions can also access the outer function’s local variables. All the accessed variables are passed to the constructor of the wrapper class. The constructor stores these values in the wrapper class’s fields, which can be accessed in the invoke method. Furthermore, if the outer function’s local variable is mutable, the compiler passes a reference type, e.g., kotlin.jvm.internal.Ref$IntRef. For an anonymous function, the compiler generates Java bytecode similar to that of a lambda expression. Similarly, for a class implementing a function type, the class implements the corresponding kotlin.jvm.functions function interface in the Java bytecode, including the interface method invoke.
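The wrapper-class scheme for lambdas that capture variables can be pictured with a hand-written Java analogue. The sketch below is an assumption-laden illustration, not compiler output: the nested Function1 interface stands in for kotlin.jvm.functions.Function1, and the names Greeter$greet$1, greet, and prefix are hypothetical.

```java
final class LambdaWrapperSketch {

    /** Stand-in for kotlin.jvm.functions.Function1, declared locally to stay self-contained. */
    interface Function1<P, R> {
        R invoke(P p);
    }

    /** Analogue of a generated wrapper class for the Kotlin lambda { s -> prefix + s }. */
    static final class Greeter$greet$1 implements Function1<String, String> {
        private final String prefix;   // captured local variable of the outer function

        Greeter$greet$1(String prefix) {
            this.prefix = prefix;      // the constructor stores the captured value in a field
        }

        @Override
        public String invoke(String s) {
            return prefix + s;         // the lambda body lives in invoke
        }
    }

    /** Analogue of the outer function that creates and calls the lambda. */
    static String greet(String prefix, String name) {
        Function1<String, String> f = new Greeter$greet$1(prefix);
        return f.invoke(name);
    }

    public static void main(String[] args) {
        System.out.println(greet("Hello, ", "Kotlin")); // prints "Hello, Kotlin"
    }
}
```

If prefix is tainted, the taint flows through the constructor into a field and only later into the return value of invoke, which is the path an analysis would need to follow.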

For a function literal with a receiver, the compiler generates Java bytecode similar to that of a lambda expression, except that the receiver object is passed as the first argument to the invoke method. For a callable reference, the compiler also generates Java bytecode similar to that of a lambda expression. However, the receiver of a callable reference is passed to the constructor of the generated wrapper class, which stores the receiver in the superclass’s field called receiver. Later, the function invoke accesses the field receiver to call the respective member function.

Java uses the invokedynamic instruction for lambda expressions. Therefore, the existing Java taint analysis tools detect taint-flows in Java lambda expressions by handling the invokedynamic instruction in the Java bytecode. However, by default, the Kotlin compiler does not use the invokedynamic instruction for an instance of a function type, which causes the existing Java taint analysis tools to miss taint-flows in higher-order functions. Therefore, the analysis must handle the generated wrapper class for an instance of a function type to track the tainted information. Furthermore, the analysis should handle the receiver property to track the tainted receiver object of a callable reference. Finally, similar to local functions (III-B6), the analysis should handle the accessed local variables of the outer functions to track tainted variables.

Note: for a functional interface, i.e., a Single Abstract Method (SAM) interface, the Kotlin compiler by default generates Java bytecode similar to that of Java’s lambda expressions, i.e., it uses the invokedynamic instruction.

8) Inline function: As discussed in Sub-Section III-B7, the Kotlin compiler generates a wrapper class for each instance of a function type and captures the outer function’s local variables. This leads to extra memory allocations, and the extra virtual method call introduces runtime overhead. However, in some scenarios, this runtime overhead can be eliminated by inlining the lambda expression rather than creating an instance of a function type. For this purpose, Kotlin provides inline functions. For example, the println function in Kotlin is declared as inline and calls Java’s System.out.println function. Therefore, in the Java bytecode, we find the System.out.println call in place of the call site of Kotlin’s println. Similarly, custom higher-order functions can also be declared as inline in Kotlin. Suppose users of taint analysis tools specify an inline function as a sink method. In that case, taint analysis tools fail to detect a taint-flow that reaches this sink method since there is no actual call to the inline function in the Java bytecode. Therefore, taint analysis tools must know the propagation rules for all the method calls in the body of that inline function. Otherwise, they fail to detect taint-flows through inline functions.

9) Sealed class: A sealed class restricts users from inheriting a class or interface, and all the derived classes are known at compile time. To achieve this, the Kotlin compiler makes the constructor private and overloads the constructor with an additional parameter at the end, of type kotlin.jvm.internal.DefaultConstructorMarker. This allows the compiler to call the overloaded constructor for the known derived classes and restricts developers from creating new derived classes. Suppose users of taint analysis tools specify the constructor of a sealed class as a propagator method. In that case, the analysis must identify the overloaded constructor as a propagator. Otherwise, taint analysis tools fail to detect taint-flows through a sealed class’s constructor.
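A possible way to derive the overloaded constructor from a user-specified one is to append the marker type to the declared parameter list. The Java sketch below illustrates only this derivation, reusing the signature layout from Sub-Section III-A; the class SealedConstructorSketch, its method, and the example class name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SealedConstructorSketch {

    static final String MARKER = "kotlin.jvm.internal.DefaultConstructorMarker";

    /** Builds the signature of the overloaded constructor: declared parameters plus the marker. */
    static String overloadedConstructor(String className, List<String> declaredParameterTypes) {
        List<String> parameters = new ArrayList<>(declaredParameterTypes);
        parameters.add(MARKER); // the additional parameter is placed at the end
        return className + ": void <init>(" + String.join(",", parameters) + ")";
    }

    public static void main(String[] args) {
        // Hypothetical sealed class: sealed class Shape(val label: String)
        System.out.println(overloadedConstructor("com.example.Shape", List.of("java.lang.String")));
    }
}
```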

10) Package: In Java, the package name must match the path of the Java file. However, in Kotlin, the package name can differ from the path of the Kotlin file. Once the analysis component completes and returns its results, some existing Java taint analysis tools use the package name to build the path of the Java file in order to display the errors in an IDE. However, if the Kotlin file’s path differs from its package, then taint analysis tools fail to display the found taint-flows in an IDE.

C. Research Questions

In the previous two sub-sections, Sub-Section III-A and Sub-Section III-B, we discussed the various engineering challenges that must be handled in the existing Java taint analysis tools to support taint analysis of Kotlin code. In this sub-section, we answer two research questions (RQ1 and RQ2), which evaluate our exploratory study.

RQ1: Which Kotlin features can be analyzed by the existing Java taint analysis tools without any engineering challenge?

To answer this research question, we list the Kotlin features for which the Kotlin compiler generates Java bytecode similar to that of the Java compiler. The existing Java taint analysis tools can analyze Kotlin programs containing these features without any engineering challenges. For all the features listed under this research question, Soot generates valid Jimple code. Furthermore, the analysis component can perform taint analysis on these features and requires no additional constructs in the DSL component to handle them.

Kotlin’s features | Similarity level | similar to
Explicit conversion | ✔* | typecasting and Java’s methods like intValue, byteValue etc.
Arithmetic operators, bitwise operators, comparison operators, assignment operators, unary operators, logical operators, equality check, literal constants, varargs | ✔ |
is operator, unsafe cast operator, safe cast operator | ✔* | instance check (instanceof), typecasting
when construct | ✔* | lookupswitch, tableswitch, comparison, goto and label
for construct | ✔* | Java’s for, iterators
while, do-while, if construct | ✔ |
return, break, continue, labeled break and labeled continue, and qualified this in nested / inner class | ✔ |
labeled return (non-local return) | ✔* | goto and label statements
try-catch, finally, throw | ✔ |
import, named arguments | ✔ |
Open class | ✔ | non-final class
Abstract class, inheritance, overriding methods, calling super class implementations, multiple inheritance | ✔ |
Functional interface (SAM) | ✔* | Java’s lambda expression (invokedynamic)
Generics | ✔ |
Nested class, inner class, enum class | ✔ |
Object expression | ✔ | instance of anonymous class
Object declaration | ✔* | singleton pattern
Delegation (in inheritance) | ✔* | Delegation pattern
varargs | ✔ |
Tail-recursive function | ✔ | normal function and loops
String template | ✔* | Java’s StringBuilder append, Kotlin’s stringPlus methods
Smart cast | ✔* | Typecasting after instance check
lateinit | ✔* | uninitialized field, null check, Null pointer exception (NPE)
Null safety | ✔* | null check, goto and label statements, NPE
Default implementation in interface | ✔ |

Legend
✔: similar to the respective feature in Java.
✔: similar to Java, but the naming scheme of the generated wrapper class is different compared to Java.
✔: completely different to Java, but there is no challenge concerning taint analysis in DSL, analysis or IR generator components.
*: similar to Java’s features, some of them are not visible in the source code.

Table VI: List of Kotlin’s features that the existing Java taint analysis tools can analyze without any challenge.

Table VI summarizes the features of Kotlin that can be analyzed by the existing Java taint analysis tools. ✔ represents Kotlin’s features for which the Kotlin compiler generates Java bytecode similar to the respective features in Java. * represents Kotlin’s features for which the Kotlin compiler generates Java bytecode similar to some Java features (third column in the table) that are not visible in the Kotlin source code. For example, the explicit conversion from the Number type to Int in Kotlin is performed using the method toInt. However, in some scenarios, the Kotlin compiler uses the intValue method in the Java bytecode. Furthermore, the Kotlin compiler uses the StringBuilder append method for string templates, and in some scenarios, it uses Kotlin’s stringPlus method. Therefore, we recommend using Java’s general propagator methods while analyzing Kotlin programs. ✔ represents Kotlin’s features for which the Kotlin compiler generates Java bytecode similar to the respective features in Java. However, the naming scheme of the wrapper class generated by the Kotlin compiler differs from that of the Java compiler.

For the default implementation in an interface, the Kotlin compiler generates the Java bytecode differently from Java ✔. The Java compiler keeps the default implementation in the interface. However, the Kotlin compiler keeps only the abstract methods in the interface, and the default implementation is placed in the generated wrapper class. For example, suppose a developer uses the default implementation method in a class that implements that interface. In that case, the Kotlin compiler overrides that method automatically and calls the default implementation present in the wrapper class. Whenever that default method is called, the Kotlin compiler calls the virtual method from the object of the derived class or interface, similar to the Java compiler. Therefore, there is no engineering challenge with this feature in the DSL, analysis, and IR generator components. However, there may be some challenges with this feature in other components such as the call graph generator component.

Kotlin’s features | Engineering challenges | can be solved | Note
Data types | Data type mapping (III-A1) | ● |
Exception types and type alias | Type alias (III-A2) | ● | Kotlin’s exception types are defined as type aliases to Java’s exception types.
Top-level functions and top-level properties | Top-level members (III-A4) | ● |
Package | Package (III-B10) | ✪ | This challenge can also be solved in the component that integrates the analysis with the IDE.
Constructor with default arguments and function with default arguments | Default argument (III-A5) | ★ |
Internal visibility modifier | Internal modifier (III-B3) | ✪ |
Sealed class | Sealed class (III-B9) | ✪ |
Inline class | Inline class (III-B4) | ✪ |
Function returning anonymous object | Function returning anonymous object (III-B5) | ✪ |
Companion object | Companion object (III-B1) | ✪ |
Infix function | Infix function (III-A7) | ● |
Local functions | Local functions (III-B6) | ✪ |
Qualified this object | qualified this object in extensions as members (III-A6), qualified this object in function with receiver type (III-B7) | ✪ | qualified this object in nested / inner class is the same as Java’s qualified this in nested / inner class. Therefore, there is no challenge in this scenario.
Destructuring declaration | Destructuring declaration (III-B2) | ✪ |
Properties accessors | Properties accessors (III-A3) | ● |
Extension function, extension property, companion object extension and extensions as members | Extensions (III-A6) | ✪ |
Data class | Destructuring declaration (III-B2), default argument (III-A5) | ✪, ★ | For a data class, the Kotlin compiler automatically generates the componentN functions for the destructuring declaration ✪. Additionally, it also generates the copy function with default values. Therefore, this feature also has the challenge of default arguments for the copy function ★.
lambda expression, anonymous function, function literal with a receiver, callable reference, class implementing function types | Higher-order function (III-B7), function type (III-A1) | ✪ |
inline function | Inline function (III-B8) | ✪ |
Operator overloading | Operator overloading (III-A8) | ● |
Ranges and Progressions | Top-level members (III-A4), infix function (III-A7), extensions (III-A6) | ● | In Ranges and Progressions, the methods until, downTo, and step are defined as top-level, extension, infix functions.
Collections and Iterators | Data type mapping (III-A1), top-level members (III-A4), extensions (III-A6) and destructuring declarations (III-B2) | ● and ✪ | Kotlin uses other features such as extensions, destructuring declaration, etc., to define the members of collections and iterators. In addition, some of the collection types are mapped to Java’s collection types, and some collection types are defined as type aliases to Java’s collection types.

Legend
●: Engineering challenge(s) can be solved in the DSL component.
★: Engineering challenge(s) can be solved in the analysis component.
✪: Engineering challenge(s) can be solved either in the DSL or analysis component (depends on the analysis designer’s decision).

Table VII: List of Kotlin’s features that require an extension in the DSL or analysis components of the existing Java taint analysis tools to support taint analysis for Kotlin programs.

RQ2: For which Kotlin features do the existing Java taint analysis tools need an extension to support taint analysis for Kotlin programs?

To answer this research question, we list the Kotlin features for which the Kotlin compiler generates Java bytecode differently from the Java compiler. Such differences create engineering challenges for the existing Java taint analysis tools when analyzing Kotlin programs, as discussed in Section III. Static-analysis developers must handle these challenges in the DSL, analysis, or IR generator components.

Table VII summarizes the features of Kotlin that require an extension in the existing Java taint analysis tools to support taint analysis for Kotlin. The engineering challenges associated with each Kotlin feature are given in the second column. If a Kotlin feature can be handled in the DSL component, we categorize that feature as ●. Furthermore, suppose a challenge can be solved in the analysis component without any input from the users of taint analysis tools. In that case, we categorize that feature as ★. For instance, we can solve the default argument challenge (Sub-Section III-A5) in the analysis component without implementing additional constructs in the DSL component.

We did not propose solutions for the challenges discussed in Section III-B. However, these challenges can be handled in the DSL or analysis component, depending on the solution and the analysis designer’s decision. Kotlin features associated with these challenges are categorized as ✪. Additionally, for all the features of Kotlin that we manually examined in this exploratory research, Soot generates valid Jimple code.

IV. SECUCHECK-KOTLIN

As a proof of concept, we extended an existing Java taint analysis tool called SECUCHECK [12] by implementing the solutions for six of the engineering challenges discussed in Sub-Section III-A. For the taint analysis, SECUCHECK uses the Jimple IR [16] generated by SOOT [17]. Furthermore, SECUCHECK provides a DSL called fluentTQL [7] for specifying taint-flows. First, we discuss the implementation of SECUCHECK-KOTLIN in Sub-Section IV-A. Then, we evaluate the applicability of SECUCHECK-KOTLIN in Sub-Section IV-B.

A. Implementation

Table VIII summarizes the challenges that we handled in SECUCHECK-KOTLIN. We implemented the solutions for these challenges without modifying the existing architecture of SECUCHECK.

For handling the data type mapping discussed in Sub-Section III-A1, we implemented a data type transformer module in fluentTQL ●*. This transformer checks whether the given type in a method signature is a valid Kotlin type or function type, as described in Tables III and IV, and transforms the given type into the respective Java data type. This allows users to provide Kotlin types such as kotlin.Int, kotlin.Int?, etc., in a method signature. In addition, users can also provide short type names such as Int, Int?, etc. Furthermore, for a function type, users can provide a regular expression such as “() → ”, or a function type itself such as “() → String”. The transformer looks for the function type expression and transforms it into a valid data type, as summarized in Table III. The limitation of the current implementation is that users cannot provide complex function types such as (Int) → (Int, Int) → String. For such function types, users must use regular expressions, e.g., ( ) → . Additionally, suppose users want to track a parameter of function type in fluentTQL. In that case, users must explicitly specify the propagation rules for the invoke method of the Function class as discussed in Sub-Section III-B7.

Challenges | Solved in | Newly added constructs in fluentTQL
Data type mapping (III-A1) | ●* | -
Type alias (III-A2) | ● | TypeAliases class for experts in Kotlin and domain experts in custom libraries. Objects of TypeAliases are accepted in MethodSignatureBuilder, MethodSelector, and MethodConfigurator.
Property (III-A3) | ● | property, getter, and setter methods
Top-level members (III-A4) | ● | topLevelMember method
Extensions (III-A6) | ● | extensionFunction and extensionProperty methods. For handling the qualified this object challenge, provides the constants QualifiedThis.DISPATCH_RECEIVER and QualifiedThis.EXTENSION_RECEIVER
Default argument (III-A5) | ★ | -

Legend
●: solved in fluentTQL DSL.
●*: solved in fluentTQL without implementing a new construct in the fluentTQL DSL.
★: solved in the analysis component.

Table VIII: List of found engineering challenges handled in SECUCHECK-KOTLIN.
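The kind of type transformation described above can be sketched as a small lookup. The following Java example is a minimal sketch and not the fluentTQL implementation: the class KotlinTypeMappingSketch and the method toJavaType are hypothetical, and the maps cover only a handful of well-known Kotlin-to-JVM type mappings (non-nullable basic types to primitives, nullable ones to boxed types), ignoring function types and generic positions.

```java
import java.util.Map;

final class KotlinTypeMappingSketch {

    private static final Map<String, String> PRIMITIVE = Map.of(
            "Int", "int", "Long", "long", "Short", "short", "Byte", "byte",
            "Boolean", "boolean", "Char", "char", "Float", "float", "Double", "double");

    private static final Map<String, String> BOXED = Map.of(
            "Int", "java.lang.Integer", "Long", "java.lang.Long",
            "Short", "java.lang.Short", "Byte", "java.lang.Byte",
            "Boolean", "java.lang.Boolean", "Char", "java.lang.Character",
            "Float", "java.lang.Float", "Double", "java.lang.Double");

    private static final Map<String, String> REFERENCE = Map.of(
            "String", "java.lang.String", "Any", "java.lang.Object");

    /** Maps a Kotlin type name (short or kotlin.-qualified, optionally nullable) to a Java type. */
    static String toJavaType(String kotlinType) {
        String type = kotlinType.trim();
        boolean nullable = type.endsWith("?");
        if (nullable) {
            type = type.substring(0, type.length() - 1);
        }
        String shortName = type.startsWith("kotlin.") ? type.substring("kotlin.".length()) : type;
        if (nullable && BOXED.containsKey(shortName)) {
            return BOXED.get(shortName);       // Int? cannot be a primitive, so it is boxed
        }
        if (PRIMITIVE.containsKey(shortName)) {
            return PRIMITIVE.get(shortName);
        }
        if (REFERENCE.containsKey(shortName)) {
            return REFERENCE.get(shortName);
        }
        return type;                           // unknown types are passed through unchanged
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("kotlin.Int"));  // prints "int"
        System.out.println(toJavaType("Int?"));        // prints "java.lang.Integer"
        System.out.println(toJavaType("String?"));     // prints "java.lang.String"
    }
}
```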

For handling type aliases (III-A2), properties (III-A3), top-level members (III-A4), and extensions (III-A6), we implemented new constructs in fluentTQL that help users specify the respective features ●. Listing 1 demonstrates how to specify type aliases and an extension property in fluentTQL of SECUCHECK-KOTLIN. For the type alias challenge, we implemented the TypeAliases class in fluentTQL, which experts in the Kotlin programming language or domain experts in custom libraries can use to specify type aliases. For instance, experts in the Kotlin programming language can specify the type aliases defined in the Kotlin standard library as shown in Lines 2-6. Then, the users of fluentTQL can use the specified type aliases in MethodConfigurator (Line 14), MethodSelector, or MethodSignatureBuilder, which replace the given type alias with the original type as specified by the experts. Note: SECUCHECK provides MethodSignatureBuilder for novice users to build a method signature with a fluent interface. Similarly, it provides MethodConfigurator and MethodSelector for configuring methods with taint information using a fluent interface.

For handling properties, top-level members, and extensions, we provide methods in the fluent interface of MethodSignatureBuilder. For example, for properties, the methods are property, getter (Line 12), and setter. If a property is an extension, then the method is extensionProperty (Line 11). This method takes three arguments: receiver type, property name, and property type. From these inputs, fluentTQL builds the valid method signature. Similarly, fluentTQL provides the extensionFunction method for specifying extension functions and the topLevelMember method for specifying top-level members. For handling the qualified this object in extensions, we provide two constants, QualifiedThis.DISPATCH_RECEIVER and QualifiedThis.EXTENSION_RECEIVER, which can be used in the method thisObject (Line 15) of MethodConfigurator to track the respective this object. The limitation of this implementation is that these methods are available only in the method chain of MethodSignatureBuilder and not in MethodConfigurator or MethodSelector. Similarly, the qualified this constants are available only for MethodConfigurator and not for MethodSelector. Finally, we handled the challenge of default arguments in the analysis component ★, as proposed in Sub-Section III-A5.

B. Evaluation

To evaluate the applicability of SECUCHECK-KOTLIN, we found a vulnerable version of the Spring PetClinic application written in Kotlin (https://shorturl.at/hvyRS). This project contains 27 Kotlin files with six known Hibernate injections, as summarized in Table IX.

Project Name | #Kotlin-files | #Taint-flows | #Queries | #Found-flows | #Runtime (s) | Display error messages? | Display line numbers? | Display file locations?
spring-petclinic-kotlin (vulnerable) | 27 | 6 | 5 | 6 | 11.05 | ✔ | ✔ | ✔

#Kotlin-files: Number of Kotlin files in the project
#Taint-flows: Number of known taint-flows in the project
#Queries: Number of specified taint-flow queries in fluentTQL of SECUCHECK-KOTLIN
#Found-flows: Number of taint-flows found by SECUCHECK-KOTLIN
#Runtime (s): Runtime in seconds (average of 10 runs)

Table IX: Overview of SECUCHECK-KOTLIN analysis results.

SECUCHECK-KOTLIN found all six taint-flows with a runtime of 11.05 seconds (average of 10 runs). SECUCHECK-KOTLIN successfully displayed the valid line numbers of the source and sink methods. It also displayed the customized error messages as well as the descriptive messages from the fluentTQL2ENGLISH translator [12]. Additionally, SECUCHECK-KOTLIN displayed the file locations of the source and sink methods. However, SECUCHECK, when run through the command prompt, displays the file locations of the classes instead of the Java files in the Static Analysis Results Interchange Format (SARIF) [18] output. Therefore, SECUCHECK-KOTLIN has no problem displaying valid file locations. However, suppose developers want to display the file locations of the Kotlin files instead of the class files in the SARIF output. In that case, the file locations of the Kotlin files have to be identified based on the challenge we discussed in Section III.

1 // Specified by Kotlin programming language experts.
2 static TypeAliases typeAliases = new TypeAliases(){{
3     add("ArrayList", "java.util.ArrayList");
4     add("HashSet", "java.util.HashSet");
5     ...
6 }};

8 // Specified by the users of fluentTQL
9 public MethodSignature signature = new MethodSignatureBuilder()
10     .atClass("de.fraunhofer.iem.EmployeePrinter")
11     .extensionProperty("de.fraunhofer.iem.Employee", "nameLength", "Int")
12     .getter();

14 public Method source1 = new MethodConfigurator(signature, typeAliases)
15     .in().thisObject(QualifiedThis.DISPATCH_RECEIVER)
16     .out().returnValue()
17     .configure();

Listing 1: Example of type alias and extension property in fluentTQL of SECUCHECK-KOTLIN.

V. CONCLUSION AND FUTURE WORK

In this paper, we presented our exploratory study for Kotlin taint analysis, which shows that most of the Kotlin constructs can be analyzed by an existing Java taint analysis tool. However, we found 18 engineering challenges that must be handled differently than in Java taint analysis. For eight of these challenges, we proposed solutions. Finally, as a proof of concept, we extended an existing Java taint analysis, SECUCHECK, by implementing six of these solutions, which led to SECUCHECK-KOTLIN. We evaluated the applicability of SECUCHECK-KOTLIN, which found all six expected taint-flows. In the future, we plan to work on the open issues from Sub-Section III-B and extend the implementation of SECUCHECK-KOTLIN, after which a thorough evaluation with real-world applications can be performed.

REFERENCES

[1] “CWE-89: Improper Neutralization of Special Elements used in an SQL Command,” https://cwe.mitre.org/data/definitions/89.html, accessed: 2021-June-22.

[2] “KTLINT: An anti-bikeshedding Kotlin linter with built-in formatter,” https://github.com/pinterest/ktlint, accessed: 2021-December-14.

[3] “DETEKT: static analysis for Kotlin,” https://github.com/detekt/detekt, accessed: 2021-December-14.

[4] “DIKTAT: Strict coding standard for Kotlin and a custom set of rules for detecting code smells, code style issues and bugs,” https://github.com/diktat-static-analysis/diKTat, accessed: 2021-December-14.

[5] “SONARQUBE: automatic code review tool to detect bugs, vulnerabilities, and code smells,” https://docs.sonarqube.org/latest/, accessed: 2021-December-14.

[6] “SONARQUBE: rules for Kotlin,” https://rules.sonarsource.com/kotlin, accessed: 2021-December-14.

[7] G. Piskachev, J. Späth, I. Budde, and E. Bodden, “Fluently specifying taint-flow queries with fluentTQL,” Empir. Softw. Eng., vol. 27, no. 5, p. 104, 2022. [Online]. Available: https://doi.org/10.1007/s10664-022-10165-y

[8] “CWE-77: Improper Neutralization of Special Elements used in a Command (‘Command Injection’),” https://cwe.mitre.org/data/definitions/77.html, accessed: 2021-October-25.

[9] “CWE-476: NULL Pointer Dereference,” https://cwe.mitre.org/data/definitions/476.html, accessed: 2021-June-18.

[10] D. Endler, “The evolution of cross site scripting attacks,” iDEFENSE Labs, Tech. Rep., 2002.

[11] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel, “FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps,” ACM SIGPLAN Notices, vol. 49, no. 6, pp. 259–269, 2014.

[12] G. Piskachev, R. Krishnamurthy, and E. Bodden, “SecuCheck: Engineering configurable taint analysis for software developers,” in 2021 IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM), 2021, pp. 24–29.

[13] P. Lam, E. Bodden, O. Lhoták, and L. Hendren, “The Soot framework for Java program analysis: a retrospective,” in Cetus Users and Compiler Infrastructure Workshop (CETUS 2011), vol. 15, no. 35, 2011.

[14] “Kotlin’s official documentation,” https://kotlinlang.org/docs/home.html, accessed: 2021-November-21.

[15] “The State of Developer Ecosystem 2020,” https://www.jetbrains.com/lp/devecosystem-2020/kotlin/, accessed: 2021-October-27.

[16] R. Vallée-Rai and L. J. Hendren, “Jimple: Simplifying Java bytecode for analyses and transformations,” 1998.

[17] R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan, “Soot: A Java bytecode optimization framework,” in CASCON First Decade High Impact Papers, ser. CASCON ’10. Riverton, NJ, USA: IBM Corp., 2010, pp. 214–224. [Online]. Available: https://doi.org/10.1145/1925805.1925818

[18] S. Kummita and G. Piskachev, “Integration of the static analysis results interchange format in CogniCrypt,” arXiv preprint arXiv:1907.02558, 2019.