SECUCHECK-KOTLIN in Section IV. Finally, we conclude and
present our future work in Section V.
II. METHODOLOGY
We examined the intermediate representation (IR) of the
Kotlin code and, where one exists, of the equivalent Java code.
Our methodology consists of automatic IR generation, together
with metadata useful for the examination, followed by a manual
examination step. We examined the following: (1) whether the
generated IR for Kotlin is valid and can be analyzed in the
same way as the IR from equivalent Java code, (2) whether
there are difficulties due to the definition of sources and sinks,
and (3) whether there are language constructs in Kotlin that
the analysis needs to handle differently than in Java. We did
not consider challenges that can arise from call-graph
generation algorithms or from algorithms that compute alias
information.
We used Kotlin’s official documentation [14] to examine
each language construct. During the examination, we covered
all constructs from the “Concepts” section and a few from
the “Standard library” section (Collections, Iterators, Ranges,
and Progressions). We did not consider constructs that were
in the experimental stage at the time of this study. Table I
summarizes the Kotlin constructs discussed in the official
documentation and those we manually examined.
Constructs | #Sub-constructs | Supported
Types, Control flow, Packages & imports, Null safety, Equality, This expression, Destructuring declarations, Ranges and Progressions | 11 | ✔
Classes and objects (except for Delegated properties) | 17 | ✔
Functions (except for Builders) | 5 | ✔
Asynchronous programming techniques, Coroutines, Annotations, and Reflections | 4 | ✘
Collections and Iterators | 2 | ✔
Legend
✔: examined all constructs in the category.
✔: examined only the basic constructs.
✘: did not examine in this study.
Table I: List of Kotlin’s features discussed in Kotlin’s official
documentation.
Kotlin targets most Java Development Kit (JDK) versions.
However, the annual developer ecosystem survey conducted
by JetBrains in 2020 shows that 73% of Kotlin developers
target JDK 8 [15]. Furthermore, Kotlin targets JDK 8 by
default. Therefore, we consider JDK 8 for this exploratory
study. Additionally, we consider Kotlin version 1.5.10.
The Kotlin compiler has various options and annotations for
modifying the compilation process, which alter the compiler's
output, the Java bytecode. For this study, we used the default
configuration of the compiler.
For the IR generation, we built a tool, JIMPLEPROVIDER, that
generates Jimple IR using the Soot framework [13]. The
Jimple code is organized based on the package name.
Furthermore, for each class, JIMPLEPROVIDER generates metadata in
a JSON file that contains information such as class name, super
class, implemented interfaces, method count, method
signatures, local variables, invoke expressions, etc. This metadata
helps to identify the challenges easily and quickly. For deeper
examination, we then examined the IR and the Java bytecode.
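To make the kind of per-class metadata described above concrete, the following is a minimal sketch that collects a subset of the listed fields (class name, super class, implemented interfaces, method count, and method signatures) with plain Java reflection. It is not JIMPLEPROVIDER, which operates on Jimple through Soot; the class `ClassMetadataSketch`, the sample types `Greeter` and `PoliteGreeter`, and the JSON key names are assumptions made for illustration. Local variables and invoke expressions are left out because reflection does not expose method bodies.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sample types used only to have something to describe.
interface Greeter {
    String greet(String name);
}

class PoliteGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Hypothetical helper, not part of JIMPLEPROVIDER: summarizes one class as JSON.
final class ClassMetadataSketch {

    // Renders strings as a JSON array. Names that come from Java source
    // contain no quotes or backslashes, so this sketch skips escaping.
    static String jsonArray(List<String> items) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('"').append(items.get(i)).append('"');
        }
        return sb.append(']').toString();
    }

    // Builds the metadata record for a single class.
    static String describe(Class<?> cls) {
        List<String> interfaces = new ArrayList<>();
        for (Class<?> itf : cls.getInterfaces()) {
            interfaces.add(itf.getName());
        }

        List<String> signatures = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            signatures.add(m.toGenericString());
        }
        Collections.sort(signatures); // reflection does not guarantee an order

        Class<?> superClass = cls.getSuperclass(); // null for interfaces and Object
        String superJson = superClass == null ? "null" : "\"" + superClass.getName() + "\"";

        return "{"
            + "\"className\": \"" + cls.getName() + "\", "
            + "\"superClass\": " + superJson + ", "
            + "\"interfaces\": " + jsonArray(interfaces) + ", "
            + "\"methodCount\": " + signatures.size() + ", "
            + "\"methodSignatures\": " + jsonArray(signatures)
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(describe(PoliteGreeter.class));
    }
}
```

A record of this shape lets an examiner compare a Kotlin class with its Java counterpart at a glance, for example by spotting a different method count, before opening the IR itself.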
A. Micro benchmark
Using real-world projects for the manual examination is
infeasible because a real-world project has a complex mix of
many constructs, making it hard to identify them clearly in
Jimple. Therefore, we built a micro benchmark suite classified
into two groups: a Kotlin suite and a Java suite. The Kotlin
suite consists of small Kotlin programs, each focusing on one
particular Kotlin construct. If a corresponding feature exists in
Java, then an equivalent program is present in the Java suite.
The suites contain six main categories: basics (43 Kotlin & 36
Java files), classes and objects (118 Kotlin & 80 Java files),
functions (27 Kotlin & 4 Java files), generics (8 Kotlin & 10
Java files), unique to Kotlin (87 Kotlin files), and collection
(11 Kotlin & 5 Java files). Table II provides an overview of
the Kotlin suite and the important features in the six categories.
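As an illustration of what a single entry in the Java suite might look like, the sketch below isolates one construct (a `switch` statement, roughly the counterpart of Kotlin's `when`) and routes a value from a source to a sink through it. The names `SwitchFlowSketch`, `source`, `sink`, and `route` are hypothetical and are not taken from the benchmark; the sink records its arguments only so that the flow can be observed when the program runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical micro-benchmark program: one construct, one source, one sink.
final class SwitchFlowSketch {

    // Values that reached the sink, recorded so the flow can be observed.
    static final List<String> SINK_LOG = new ArrayList<>();

    // Stand-in for a taint source such as user input.
    static String source() {
        return "tainted";
    }

    // Stand-in for a sensitive sink such as a database query.
    static void sink(String value) {
        SINK_LOG.add(value);
    }

    // Construct under test. A possible Kotlin counterpart (assumed, not
    // taken from the suite):
    //   val out = when (mode) { 1 -> data; 2 -> "constant"; else -> "" }
    static String route(int mode, String data) {
        String out;
        switch (mode) {
            case 1:
                out = data;        // the tainted value flows through
                break;
            case 2:
                out = "constant";  // the tainted value is dropped
                break;
            default:
                out = "";
                break;
        }
        return out;
    }

    public static void main(String[] args) {
        // A taint analysis is expected to report this source-to-sink flow.
        sink(route(1, source()));
        // No tainted value arrives here at run time, although a
        // path-insensitive analysis may still report it.
        sink(route(2, source()));
        System.out.println(SINK_LOG);
    }
}
```

Keeping each program this small means the Jimple generated for it contains little besides the construct of interest, which is what makes a side-by-side comparison of the Kotlin and Java IR practical.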
Categories in Kotlin suite | Major features | #Kotlin files
basics | data types, control flow, package, import, exceptions, equality, operators, variables | 41
classesAndObjects | classes, enum class, inline class, sealed class, nested / inner class, interface, functional interface (SAM), object expression, object declaration, delegation, qualified this, type aliases, visibility modifiers | 118
functions | simple functions, default arguments, local functions, infix notations, tail recursive function, varargs | 27
generics | simple generic type, generic functions, raw types, upper bounds | 8
uniqueToKotlin | data class, destructuring declaration, extensions, higher-order functions, inline functions, null safety, operator overloading, primary constructor, properties, ranges, progressions, smart cast, string template, declaration site variance, type projection | 87
collection | collection and iterators | 11
Table II: Overview of Kotlin suite.
B. Manual examination
The manual examination of the Jimple code was performed
by the first author, who has more than 4.5 years of
software development experience and is a Ph.D. student focusing
on programming languages and static analysis. The more
complex constructs, especially those specific to Kotlin or
with differences from Java, were discussed with the second
author, a final-year Ph.D. student with expertise in
static analysis, and an external researcher with professional
experience in Kotlin development. The examiners used
JIMPLEPROVIDER to generate the IR for the entire micro
benchmark. Then, each construct was inspected manually.
First, the generated metadata, which provides information
relevant to taint analysis, was studied. Next, the generated IR
was inspected more closely. If more information was needed,
the generated bytecode was examined. Based on this, the
examiner concluded whether a construct requires special handling
in Kotlin taint analysis compared to Java taint analysis.