And the theory behind a potential third and fourth
May 30 · 11 min read
In this post, we’ll look at two small Java programs with the same flavor: they modify themselves.
I know of no practical use for these programs and I make no excuse for that. A while back the thought struck me to write a program which modified its own execution and I just followed that thought.
What is a self-modifying Java program?
Before we begin, let me quickly define what I mean by a self-modifying program.
Trivially, any program that maintains state modifies itself. For instance, the simplest for loop, for (int i = 0; i < n; ++i), is a self-modifying program. The variable i is a part of the program and is modified; therefore, by definition, this program modifies itself.
I mean something less trivial, though. A self-modifying program is one that modifies its structure, not its state. There should be a change in the program’s behavior rather than in its data. Let’s take a more concrete example.
General program structure
Both the programs have the structure shown below. We’ll discuss it after the code itself.
We have a final class named Replacement0 that has a main method. This should be the main class we run. The class has a public final method named dwim (short for do-what-i-mean) that accepts three arguments, each of which is also final. The method simply concatenates their string values and returns a String.
In the above, the crucial lines are lines 16 and 21. I’ve highlighted them. Notice that in line 16 the function prints exactly what one would expect from reading the method dwim (3.14159[2, 6, 53]). However, we get a different output from dwim at line 21 (2.718281828[4, 5]). In effect, we have modified a final method of a final class in Java. How? That’s what we explore here.
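A minimal sketch of the shape just described may help. Everything concrete in it is an assumption made for illustration: the class name Replacement0Sketch, the argument types, and the argument values are hypothetical, and doTheDeed is an empty placeholder. Because nothing is redefined here, both calls print the same text; the two programs below differ only in what they put inside doTheDeed.

```java
import java.util.List;

// Hypothetical stand-in for the structure described above; not the original listing.
final class Replacement0Sketch {

    public static void main(String[] args) throws Exception {
        var self = new Replacement0Sketch();

        // First call: prints what a reader of dwim would expect.
        System.out.println(self.dwim(3.14159, List.of(2, 6, 53), ""));

        // Placeholder: the real programs redefine this very class here.
        doTheDeed();

        // Second call: in this sketch the output is unchanged,
        // because doTheDeed does nothing.
        System.out.println(self.dwim(3.14159, List.of(2, 6, 53), ""));
    }

    // Concatenates the string values of its three (final) arguments.
    public final String dwim(final Object first, final Object second, final Object third) {
        return String.valueOf(first) + second + third;
    }

    // Intentionally empty in this sketch.
    static void doTheDeed() throws Exception {
    }
}
```

The point of the exercise is that dwim and its class are both final, so no subclass or override can explain a different second result.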
Goodhart’s Law states “When a measure becomes a target, it ceases to be a good measure”. Later in the article, we’ll look at some ways to achieve the above goal but without actually modifying the program’s behavior.
Batteries Included
Neither of the two solutions I show below requires any additional dependencies. They both work with just plain ol’ Java. They were tested on OpenJDK 11 and AWS Corretto JDK 11.
The two techniques
⒈ Java Debug Interface (JDI) and
⒉ Java Agents API
Talk is cheap. Show me the code, you say? Here’s Replacement1.java, built using the JDI API, followed by Replacement2.java, built using Java Agents. The actual code would be too much and doesn’t render well, so I’ve only shown excerpts. The full code is available on GitHub. These programs are best understood by playing with them, so I strongly encourage running them.
⒈ Replacement1.java JDI
7 /**
8 * A Java program that modifies itself using the JDI API. For details
17 * <h3>Compilation Instructions</h3>
19 * javac Replacement1.java
21 *
22 * <h3>Run instructions</h3>
35 * java -agentlib:jdwp=transport=dt_socket,address=2718,server=y,suspend=n \
36 * Replacement1.java
38 */
39 public class Replacement1 {
40 public static void main(String[] args)
41 throws Exception {
// ... elided because it is the same
// ... as Replacement0's main method
55 }
56
57 private final <T> String dwim(
// ... elided
62 }
63
64 private static final void doTheDeed() throws Exception {
65 var vmm = com.sun.jdi.Bootstrap.virtualMachineManager();
66
67 AttachingConnector socketConnector = null;
68 for (var connector : vmm.attachingConnectors()) {
69 if (connector.description().contains("socket")) {
70 socketConnector = connector;
71 break;
72 }
73 }
74
75 var defaultArguments = socketConnector.defaultArguments();
76 defaultArguments.get("port").setValue("2718");
77
78 var vm = socketConnector.attach(defaultArguments);
79 var replacementReferenceType = vm.classesByName("Replacement1").stream()
80 .findFirst()
81 .orElseThrow();
82
83 vm.redefineClasses(Map.of(replacementReferenceType, replacement()));
84 }
85
86 private static final byte[] replacement() {
87 // byte values elided
158 }
159 }
⒉ Replacement2.java Java Agent
18 /**
19 * A self modifying program which uses the Java Agents API.
* ... some lines skipped
*
24 * Compile Instructions
25 * <pre>
26 * javac Replacement2.java
27 * </pre>
*
29 * Run Instructions
30 * <pre>
31 * java -Djdk.attach.allowAttachSelf=true Replacement2
32 * </pre>
33 */
34 public class Replacement2 {
35 public static void main(String[] args)
36 throws Exception {
// ... elided because it is the same
// ... as Replacement0's main method
50 }
51
52 private final <T> String dwim(
// ... elided
57 }
58
59 private static Instrumentation ageinst;
60
61 public static void agentmain(
62 @SuppressWarnings("unused") String args,
63 Instrumentation inst) {
64 ageinst = inst;
65 }
66
67 private static void doTheDeed() throws Exception {
68 var manifestRows = List.of(
69 "Implementation-Title: selfmod",
70 "Implementation-Version: 0.0.1",
71 "Agent-Class: Replacement2",
72 "Can-Redefine-Classes: true",
73 "Can-Retransform-Classes: true",
74 "Can-Set-Native-Method-Prefix: false");
75
76 var manifest = new Manifest();
77 manifest.read(new ByteArrayInputStream(manifestRows.stream()
78 .collect(joining("\n"))
79 .getBytes(UTF_8)));
80
81 var jarFile = Paths.get("selfmod.jar");
82 try (var out = newOutputStream(jarFile, WRITE, CREATE)) {
83 var jout = new JarOutputStream(out);
84
85 jout.putNextEntry(new ZipEntry("META-INF/MANIFEST.MF"));
86 jout.write(manifestRows.stream()
87 .collect(joining("\n"))
88 .getBytes(UTF_8));
89 jout.closeEntry();
90
91 jout.putNextEntry(new ZipEntry("Replacement.class"));
92 jout.write(replacement());
93 jout.closeEntry();
94 jout.finish();
95 }
96 jarFile.toFile().deleteOnExit();
97
98 var vm = VirtualMachine.attach(
"" + ProcessHandle.current().pid());
99
100 vm.loadAgent("selfmod.jar");
101
102 ageinst.redefineClasses(new ClassDefinition(
103 Replacement2.class,
104 replacement()));
105 }
106
107 private static final byte[] replacement() {
115 return new byte[] {
// byte values elided
221 };
222 }
223 }
What’s happening?
As was shown in the first gist, both programs share the same technique. They have some common elements which we’ll go over now. We’ll go through the details when we cover each implementation.
All the magic happens in the method doTheDeed, which I’ve shown. I’ve removed (elided) the import statements and the methods main and dwim, because they’re the same as shown in Replacement0.java. Both implementations have a method named replacement which returns a byte array. I’ve elided it too because it’s only a series of bytes. The actual contents of the byte array are crucial to understanding the magic behind this program, but actually printing them in this post only hinders understanding. You’ll have to play with it yourself.
Both implementations do the self modification in the method doTheDeed. They connect to themselves and redefine their own class. Once the class is redefined, the method dwim returns a different value. This value is dutifully printed.
The work done in doTheDeed can be divided into three parts: (ⅰ) discovery: the setup needed to connect to its own process, (ⅱ) connection: actually connecting to itself, and (ⅲ) redefinition: changing the behavior of the class and method.
doTheDeed invokes the method replacement, which returns a byte array. This byte array is the Replacement class’s new behavior, expressed as Java byte-code. There are two aspects to it: its contents and its provenance. I leave both as exercises to the reader.
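As a starting point for that exercise, here is a small sketch of how the bytes of a compiled class can be read back as a byte array. It is an assumption that this is useful for reproducing the arrays; it does not claim to be how the arrays in these programs were produced. The class name ClassBytesSketch and the helper classBytes are hypothetical. The only fact relied on is that every class file starts with the magic number 0xCAFEBABE.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: reads the class file of an already loaded class.
class ClassBytesSketch {

    static byte[] classBytes(Class<?> type) {
        // Absolute resource name, e.g. /java/lang/String.class
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file found for " + type.getName());
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = classBytes(String.class);
        // The first four bytes of any class file are the magic number.
        System.out.printf("%02x%02x%02x%02x (%d bytes)%n",
                bytes[0], bytes[1], bytes[2], bytes[3], bytes.length);
    }
}
```

Pointing the same helper at a compiled, modified copy of a class would yield bytes of the kind a method like replacement returns.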
⒈ Replacement1.java JDI – Java Debug Interface
Replacement1.java solves the problems of discovery and connection by using the Java Debug Interface. According to Oracle’s official documentation ¹, JDI “provides a pure Java programming language interface for debugging Java programming language applications” and it “defines a high-level Java language interface which tool developers can easily use to write remote debugger applications” ¹. It is one of the three components of Java’s JPDA (Java Platform Debugger Architecture) ².
The idea is that if you want to write your own debugging tool on top of Java, you can use the JDI to do it. And that’s what Replacement1.java does.
Except, where your typical debugger connects to another program, here, our program connects to itself. I find the idea of a program debugging itself extremely cool and somewhat subversive. One might argue that this is utterly ordinary: the JDI allows connecting to running Java programs; it doesn’t care if that program is another program or itself. This is true but changes nothing for me. I am impressed.
Ⅰ. Discovery
The program’s run instructions include this vital line: -agentlib:jdwp=transport=dt_socket,server=y,address=2718,suspend=n
This runs the program normally but also starts a Java Debug server (server=y) on port 2718 (address=2718).
In the method doTheDeed (see line 64) the program gets a reference to JDI’s VirtualMachineManager, through which it gets a reference to the socket connector (transport=dt_socket).
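The discovery step can be tried on its own, without attaching to anything. The sketch below is a variation rather than the program’s own code: it selects the connector by its transport name, dt_socket, instead of searching the description for “socket”, and then lists the connector’s default arguments. The class name ConnectorDiscoverySketch and the helper findSocketConnector are hypothetical; the JDI calls themselves are the standard ones from the jdk.jdi module.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.connect.AttachingConnector;
import java.util.Optional;

// Hypothetical sketch of the discovery step only; nothing is attached.
class ConnectorDiscoverySketch {

    // Finds the attaching connector whose transport matches transport=dt_socket.
    static Optional<AttachingConnector> findSocketConnector() {
        return Bootstrap.virtualMachineManager().attachingConnectors().stream()
                .filter(connector -> "dt_socket".equals(connector.transport().name()))
                .findFirst();
    }

    public static void main(String[] args) {
        var connector = findSocketConnector()
                .orElseThrow(() -> new IllegalStateException("No socket attaching connector"));

        System.out.println(connector.name() + ": " + connector.description());

        // These are the arguments doTheDeed fills in before attaching,
        // for example "port".
        connector.defaultArguments().forEach((key, argument) ->
                System.out.println("  " + key + " = '" + argument.value() + "'"));
    }
}
```

Running it on a JDK shows which argument names are available before any value, such as the port, is set.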
Ⅱ. Connection
The program connects to itself (see line 78) and gets a reference to the VirtualMachine: the same VirtualMachine it’s running on.
The self modification happens with the call to redefineClasses (line 83), which surreptitiously changes the definition of the method dwim and gives us the new result of 2.718281828[4, 5]. There are also other changes in the byte-code, but I’ll leave figuring them out as an exercise to the reader.
⒉ Replacement2.java Java Agent
The Java Instrumentation API, first introduced in Java 1.5, “provides services that allow Java programming language agents to instrument programs running on the JVM. The mechanism for instrumentation is modification of the byte-codes of methods” ³.
There are two ways to start Java Agents. The first is to start at the same time as the JVM does and the second allows us to start an Agent after the VM starts. We use the second technique. For details see the section §Starting an Agent After VM Startup in the Java Instrumentation page. I repeat the relevant sections:
An implementation may provide a mechanism to start agents sometime after the VM has started. The details as to how this is initiated are implementation specific but typically the application has already started and its main method has already been invoked. In cases where an implementation supports the starting of agents after the VM has started the following applies:
The manifest of the agent JAR must contain the attribute Agent-Class in its main manifest. The value of this attribute is the name of the agent class.
⒈ The agent class must implement a public static agentmain method.
⒉ The agentmain method has one of two possible signatures. The JVM first attempts to invoke the following method on the agent class:
public static void agentmain(String agentArgs, Instrumentation inst)
Ⅰ. Discovery
Now, to start a Java Agent after the VM starts, you need a .jar file. However, our implementation has only a single file. Not to worry, though. Java SE provides us with a class that can create JAR files: JarOutputStream. We use it to create a file named selfmod.jar. We ensure that this file’s manifest points to the program itself as the Java Agent (see line 71, "Agent-Class: Replacement2").
To avoid messing up our surrounding environment, we also ensure the jar-file is deleted when the program ends (see line 96, jarFile.toFile().deleteOnExit();).
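The jar-building step can also be explored in isolation. The sketch below is a variation on the approach above, not the program’s code: it builds the manifest through the Manifest and Attributes classes instead of joining text rows, writes a manifest-only JAR to a temporary file, and reads Agent-Class back to confirm it was stored. The class AgentJarSketch and its two helpers are hypothetical names, and the sketch does not load the JAR as an agent.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical sketch: writes an agent-style JAR and reads its manifest back.
class AgentJarSketch {

    static Path writeAgentJar(String agentClassName) {
        var manifest = new Manifest();
        var main = manifest.getMainAttributes();
        // Without Manifest-Version the Manifest class writes no attributes at all.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Agent-Class", agentClassName);
        main.putValue("Can-Redefine-Classes", "true");
        main.putValue("Can-Retransform-Classes", "true");

        try {
            Path jar = Files.createTempFile("selfmod-sketch", ".jar");
            // This constructor writes META-INF/MANIFEST.MF as the first entry.
            try (var out = new JarOutputStream(Files.newOutputStream(jar), manifest)) {
                // No further entries in this sketch.
            }
            jar.toFile().deleteOnExit();
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String readAgentClass(Path jar) {
        try (var jarFile = new JarFile(jar.toFile())) {
            return jarFile.getManifest().getMainAttributes().getValue("Agent-Class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path jar = writeAgentJar("Replacement2");
        System.out.println(jar + " -> Agent-Class: " + readAgentClass(jar));
    }
}
```

Using a temporary file rather than a fixed name like selfmod.jar is a design choice of the sketch; it avoids clobbering a file that may already exist in the working directory.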
Ⅱ. Connection
There are two parts to connecting to the VM. The first is defined by the Java Instrumentation API. This is what the static method agentmain does for us. In it we capture the java.lang.instrument.Instrumentation instance, which later redefines the class to give us our new behavior. The second part is in the program’s run instructions. When we run the program, we must specify the system property -Djdk.attach.allowAttachSelf=true because, starting with Java 9, the JDK prohibits self attachment by default ⁴. (As an aside, I contemplated modifying my program so this wouldn’t be necessary. There were two options, of which I’ll admit only the first occurred to me. One was to reflectively change the above flag. The second, more ingenious technique is to start a child process and have that connect to the parent process. See Byte-Buddy ⁵ for examples of both. Eventually, in the interests of simplicity, I decided being explicit was better.)
Once the Agent starts, it gets the new byte-code from replacement(), which ensures that dwim() returns a new result.
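Since forgetting the flag is an easy mistake, a small preflight check can fail fast with a readable message. The sketch below is an addition for illustration, not part of Replacement2.java. The class SelfAttachPreflightSketch and the helper looksEnabled are hypothetical, and treating an empty value (as produced by a bare -Djdk.attach.allowAttachSelf) as enabled is an assumption about how the JDK interprets the property. Note also that the JDK reads the value at startup, so this check only reports what it sees and cannot turn the feature on.

```java
// Hypothetical preflight check; it does not attach to anything.
class SelfAttachPreflightSketch {

    static final String FLAG = "jdk.attach.allowAttachSelf";

    // Assumption: "true" (any case) or an empty value counts as enabled.
    static boolean looksEnabled(String value) {
        return value != null && (value.isEmpty() || Boolean.parseBoolean(value));
    }

    // The same pid string that is passed to VirtualMachine.attach.
    static String ownPid() {
        return Long.toString(ProcessHandle.current().pid());
    }

    public static void main(String[] args) {
        if (!looksEnabled(System.getProperty(FLAG))) {
            System.err.println("Self-attachment is not enabled for pid " + ownPid()
                    + "; re-run with -D" + FLAG + "=true");
            return;
        }
        System.out.println("Self-attachment looks enabled for pid " + ownPid());
    }
}
```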