Two Java programs that modify themselves

Category: IT Technology · Published: 4 years ago


And the theory behind a potential third and fourth

In this post, we’ll look at two small Java programs with the same flavor: they modify themselves.

I know of no practical use for these programs and I make no excuse for that. A while back the thought struck me to write a program which modified its own execution and I just followed that thought.

What is a self-modifying Java program?

Before we begin, let me quickly define what I mean by a self-modifying program.

Trivially, any program that maintains state modifies itself. For instance, the simplest for loop, for (int i = 0; i < n; ++i), is a self-modifying program. The variable i is a part of the program and is modified; therefore, by definition, this program modifies itself.

I mean something less trivial, though. A self-modifying program is one that modifies its structure, not its state. There should be a change in the program’s behavior rather than in its data. Let’s take a more concrete example.

General program structure

Both programs have the structure shown below. We’ll discuss it after the code itself.

The general structure of the self-modifying programs. Did you notice that the code itself refers to this blog and vice versa?

We have a final class named Replacement0 that has a main method. This should be the main class we run. The class has a public final method named dwim (short for do-what-i-mean) that accepts three arguments, each of which is also final. The method simply concatenates their string values and returns a String.

In the above, the crucial lines are lines 16 and 21. I’ve highlighted them. Notice that in line 16 the function prints exactly what one would expect from reading the method dwim (3.14159[2, 6, 53]). However, we get a different output from dwim at line 21 (2.718281828[4, 5]). In effect, we have modified a final method of a final class in Java. How? That’s what we explore here.
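
The listing itself is not reproduced in this copy of the post. As a rough sketch of the structure described above, and not the author’s actual code, something like the following would behave the same way before any modification takes place. The class name Replacement0Sketch, the generic signature of dwim, the three argument values, and the empty doTheDeed placeholder are all assumptions made for illustration; the line numbers 16 and 21 mentioned above refer to the original listing, not to this sketch.

```java
import java.util.List;

// Hypothetical stand-in for Replacement0: same shape, no self-modification.
public final class Replacement0Sketch {

  public static void main(String[] args) throws Exception {
    var self = new Replacement0Sketch();

    // Assumed arguments, chosen only so the output matches the post.
    System.out.println(self.dwim(3, ".14159", List.of(2, 6, 53)));

    // In the real programs this call redefines the class.
    doTheDeed();

    // Same call again; the real programs print a different value here.
    System.out.println(self.dwim(3, ".14159", List.of(2, 6, 53)));
  }

  // Concatenates the string values of its three (final) arguments.
  public final <T> String dwim(final T first, final T second, final T third) {
    return "" + first + second + third;
  }

  // Placeholder: the two techniques in this post differ only in this method.
  private static void doTheDeed() throws Exception {
    // intentionally empty in this sketch
  }
}
```

Because doTheDeed does nothing here, both calls print 3.14159[2, 6, 53]. Making the second call return something else is the whole point of the two techniques.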

Goodhart’s Law states “When a measure becomes a target, it ceases to be a good measure”. Later in the article, we’ll look at some ways to achieve the above goal but without actually modifying the program’s behavior.

Batteries Included

Neither of the two solutions I show below requires any additional dependencies. They both work with just plain ol’ Java. They were tested on OpenJDK 11 and AWS Corretto JDK 11.

The two techniques

⒈ Java Debug Interface (JDI) and

⒉ Java Agents API

Talk is cheap. Show me the code, you say? Here’s Replacement1, built using the Java JDI API, followed by Replacement2, built using Java Agents. The actual code would be too much and doesn’t render well, so I’ve only shown excerpts. The full code is available on GitHub. These programs are best understood by playing with them, so I strongly encourage running them.

⒈ JDI


  7 /**
  8  * A Java program that modifies itself using the JDI API.  For details
 17  * <h3>Compilation Instructions</h3>
 19  * javac
 21  *
 22  * <h3>Run instructions</h3>
 35  * java -agentlib:jdwp=transport=dt_socket,address=2718,server=y,suspend=n      \
 36  *
 38  */
 39 public class Replacement1 {
 40   public static void main(String[] args)
 41       throws Exception {
          // ... elided because it is the same
          // ... as Replacement0's main method
 55   }
 57   private final <T> String dwim(
          // ... elided
 62   }
 64   private static final void doTheDeed() throws Exception {
 65     var vmm = com.sun.jdi.Bootstrap.virtualMachineManager();
 67     AttachingConnector socketConnector = null;
 68     for (var connector : vmm.attachingConnectors()) {
 69       if (connector.description().contains("socket")) {
 70         socketConnector = connector;
 71         break;
 72       }
 73     }
 75     var defaultArguments = socketConnector.defaultArguments();
 76     defaultArguments.get("port").setValue("2718");
 78     var vm = socketConnector.attach(defaultArguments);
 79     var replacementReferenceType = vm.classesByName(
            /* class-name argument not shown in this excerpt */)
            .stream()
 80         .findFirst()
 81         .orElseThrow();
 83     vm.redefineClasses(Map.of(
            replacementReferenceType, replacement()));
 84   }
 86   private static final byte[] replacement() {
 87       // byte values elided
158   }
159 }

⒉ Java Agent

 18 /**
 19  * A self modifying program which uses the Java Agents API.
     * ... some lines skipped
 24  * Compile Instructions
 25  * <pre>
 26  * javac
 27  * </pre>
 29  * Run Instructions
 30  * <pre>
 31  * java -Djdk.attach.allowAttachSelf=true Replacement2
 32  * </pre>
 33  */
 34 public class Replacement2 {
 35   public static void main(String[] args)
 36       throws Exception {
          // ... elided because it is the same
          // ... as Replacement0's main method
 50   }
 52   private final <T> String dwim(
          // ... elided
 57   }
 59   private static Instrumentation ageinst;
 61   public static void agentmain(
 62       @SuppressWarnings("unused") String args,
 63       Instrumentation inst) {
 64     ageinst = inst;
 65   }
 67   private static void doTheDeed() throws Exception {
 68     var manifestRows = List.of(
 69         "Implementation-Title: selfmod",
 70         "Implementation-Version: 0.0.1",
 71         "Agent-Class: Replacement2",
 72         "Can-Redefine-Classes: true",
 73         "Can-Retransform-Classes: true",
 74         "Can-Set-Native-Method-Prefix: false");
 76     var manifest = new Manifest();
 77     manifest.read(new ByteArrayInputStream(manifestRows.stream()
 78         .collect(joining("\n"))
 79         .getBytes(UTF_8)));
 81     var jarFile = Paths.get("selfmod.jar");
 82     try (var out = newOutputStream(jarFile, WRITE, CREATE)) {
 83       var jout = new JarOutputStream(out);
 85       jout.putNextEntry(new ZipEntry("META-INF/MANIFEST.MF"));
 86       jout.write(manifestRows.stream()
 87           .collect(joining("\n"))
 88           .getBytes(UTF_8));
 89       jout.closeEntry();
 91       jout.putNextEntry(new ZipEntry("Replacement.class"));
 92       jout.write(replacement());
 93       jout.closeEntry();
 94       jout.finish();
 95     }
 96     jarFile.toFile().deleteOnExit();
 98     var vm = VirtualMachine.attach(
            "" + ProcessHandle.current().pid());
100     vm.loadAgent("selfmod.jar");
102     ageinst.redefineClasses(new ClassDefinition(
103         Replacement2.class,
104         replacement()));
105   }
107   private static final byte[] replacement() {
115     return new byte[] {
          // byte values elided
221     };
222   }
223 }

What’s happening?

As was shown in the first gist, both programs share the same technique. They have some common elements which we’ll go over now. We’ll go through the details when we cover each implementation.

All the magic happens in the method doTheDeed, which I’ve shown. I’ve removed (elided) the import statements and the methods main and dwim, because they’re the same as shown in Replacement0. Both implementations have a method named replacement which returns a byte array. I’ve elided it too because it’s only a series of bytes. The actual contents of the byte array are crucial to understanding the magic behind this program but actually printing them in this post only hinders understanding. You’ll have to play with it yourself.

Both implementations do the self modification in the method doTheDeed. They connect to themselves and redefine their own class. Once the class is redefined, the method dwim returns a different value. This value is dutifully printed.

The work done in doTheDeed can be divided into three parts: (ⅰ) discovery, the setup necessary for connecting to its own process; (ⅱ) connection, actually connecting to itself; and (ⅲ) redefinition, changing the behavior of the class and method.

doTheDeed invokes the method replacement, which returns a byte array. This byte array is the Replacement class’s new behavior, expressed as Java byte-code. There are two aspects to it: its contents and its provenance. I leave both as exercises to the reader.

⒈ JDI (Java Debug Interface)

Replacement1 solves the problem of discovery and connection by using the Java Debug Interface. According to Oracle’s official documentation, JDI “provides a pure Java programming language interface for debugging Java programming language applications” and it “defines a high-level Java language interface which tool developers can easily use to write remote debugger applications” ¹. It is one of the three components of Java’s JPDA (Java Platform Debugger Architecture) ².

The idea is that if you want to write your own debugging tool on top of Java, you could use the JDI to achieve this. And that’s what Replacement1 does.

Except, where your typical debugger connects to another program, here, our program connects to itself. I find the idea of a program debugging itself extremely cool and somewhat subversive. One might argue that this is utterly ordinary: the JDI allows connecting to running Java programs; it doesn’t care if that program is another program or itself. This is true but changes nothing for me. I am impressed.

Ⅰ. Discovery

The program’s run instructions include this vital option:

-agentlib:jdwp=transport=dt_socket,address=2718,server=y,suspend=n

This runs the program normally but also starts a Java Debug server (server=y) on port 2718 (address=2718).

In the method doTheDeed (see line 64), the program gets a reference to JDI’s VirtualMachineManager, through which it gets a reference to the socket connector (transport=dt_socket).
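
The discovery step can be tried on its own, without a debug server running. The following is a minimal sketch, assuming a full JDK (the jdk.jdi module must be available); the class and method names are made up for illustration. It also swaps the excerpt’s description().contains("socket") check for a match on the transport name, which is an alternative I chose here, not what the original program does.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.connect.AttachingConnector;
import java.util.Optional;

// Hypothetical helper: finds the connector used to attach over a socket.
public class JdiDiscoverySketch {

  // Returns the attaching connector whose transport is dt_socket, if any.
  public static Optional<AttachingConnector> socketConnector() {
    return Bootstrap.virtualMachineManager().attachingConnectors().stream()
        .filter(c -> "dt_socket".equals(c.transport().name()))
        .findFirst();
  }

  public static void main(String[] args) {
    var connector = socketConnector().orElseThrow();
    System.out.println(connector.name());
    // The names of the arguments the connector accepts, "port" among them.
    System.out.println(connector.defaultArguments().keySet());
  }
}
```

Nothing is attached at this point; the sketch only shows where the "port" argument that the program later sets to 2718 comes from.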

Ⅱ. Connection

The program connects to itself (see line 78) and gets a reference to the VirtualMachine: the same VirtualMachine it’s running on.

The self modification happens with the call to redefineClasses (line 83), which surreptitiously changes the definition of the method dwim and gives us the new result of 2.718281828[4, 5]. There are also other changes in the byte-code but I’ll leave figuring them out as an exercise to the reader.

⒉ Java Agent

The Java Instrumentation API, first introduced in Java 1.5, “provides services that allow Java programming language agents to instrument programs running on the JVM. The mechanism for instrumentation is modification of the byte-codes of methods” ³.

There are two ways to start Java Agents. The first is to start at the same time as the JVM does and the second allows us to start an Agent after the VM starts. We use the second technique. For details see the section §Starting an Agent After VM Startup in the Java Instrumentation page. I repeat the relevant sections:

An implementation may provide a mechanism to start agents sometime after the VM has started. The details as to how this is initiated are implementation specific but typically the application has already started and its main method has already been invoked. In cases where an implementation supports the starting of agents after the VM has started the following applies:

The manifest of the agent JAR must contain the attribute Agent-Class in its main manifest. The value of this attribute is the name of the agent class.

⒈ The agent class must implement a public static agentmain method.

⒉ The agentmain method has one of two possible signatures. The JVM first attempts to invoke the following method on the agent class:

public static void agentmain(String agentArgs, Instrumentation inst)

Ⅰ. Discovery

Now, to start a Java Agent after the VM starts, you need a .jar file. However, our implementation has only a single file. Not to worry, though. Java SE provides a class that can create JAR files: JarOutputStream. We use it to create a file named selfmod.jar. We ensure that this file’s manifest names our own class as the Java Agent (see line 71, "Agent-Class: Replacement2").

To avoid messing up our surrounding environment, we also ensure the JAR file is deleted when the program ends (see line 96, jarFile.toFile().deleteOnExit();).
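
To see the JAR-building step in isolation, here is a small sketch under a few assumptions. It hands a Manifest object to the JarOutputStream constructor instead of writing the META-INF/MANIFEST.MF entry by hand as the excerpt does, it writes no class entry, and the names AgentJarSketch, writeAgentJar and readAgentClass are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper: writes a JAR whose manifest names an agent class.
public class AgentJarSketch {

  public static Path writeAgentJar(Path jarPath, String agentClassName)
      throws IOException {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    // A manifest is expected to start with its version attribute.
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    attrs.putValue("Agent-Class", agentClassName);
    attrs.putValue("Can-Redefine-Classes", "true");
    attrs.putValue("Can-Retransform-Classes", "true");

    // The constructor writes META-INF/MANIFEST.MF as the first entry.
    try (OutputStream out = Files.newOutputStream(jarPath);
         JarOutputStream jar = new JarOutputStream(out, manifest)) {
      // No further entries are needed for this sketch.
    }
    return jarPath;
  }

  // Reads the Agent-Class attribute back from the JAR's manifest.
  public static String readAgentClass(Path jarPath) throws IOException {
    try (JarFile jar = new JarFile(jarPath.toFile())) {
      return jar.getManifest().getMainAttributes().getValue("Agent-Class");
    }
  }

  public static void main(String[] args) throws IOException {
    Path jar = Files.createTempFile("selfmod-sketch", ".jar");
    jar.toFile().deleteOnExit(); // same clean-up idea as in the post
    writeAgentJar(jar, "Replacement2");
    System.out.println(readAgentClass(jar));
  }
}
```

Reading the attribute back is only a sanity check; in the real program the JVM reads it when loadAgent is called.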

Ⅱ. Connection

There are two parts to connecting to the VM. The first is defined by the Java Instrumentation API. This is what the static method agentmain does for us. In it we capture the java.lang.instrument.Instrumentation instance which later redefines the class to give us our new behavior. The second part is in the program’s run instructions. When we run the program, we must specify the system property -Djdk.attach.allowAttachSelf=true because, starting with Java 9, the JDK prohibits self-attachment by default.

(As an aside, I contemplated modifying my program so this wouldn’t be necessary. There were two options, of which I’ll admit only the first occurred to me. One was to reflectively change the above flag. The second, more ingenious technique is to start a child process and have that connect to the parent process. See Byte-Buddy ⁵ for examples of both. Eventually, in the interests of simplicity, I decided being explicit was better.)

Once the Agent starts, doTheDeed passes the new byte-code from replacement() to Instrumentation.redefineClasses (line 102), which ensures that dwim() returns a new result.
