String manipulation is one of the most common activities in computer programming. In this blog we will talk about String immutability, String concatenation, operator overloading, and StringBuilder.
Immutable Strings
Objects of the String class are immutable. Every method of the String class that appears to modify a String actually creates and returns a brand new String object containing the modification. The original String is left untouched.
Let's see an example:

    package blog;

    public class Immutable {
        // Returns a new, uppercased String; the argument itself is never modified.
        public static String uperCase(String str) {
            return str.toUpperCase();
        }

        public static void main(String[] args) {
            String original = "santosh";
            System.out.println(original);
            String modified = uperCase(original);
            System.out.println(modified);
            System.out.println(original); // still "santosh"
        }
    }
Output
santosh
SANTOSH
santosh
When original is passed to uperCase(), what is actually passed is a copy of the reference to original. The object this reference is connected to stays in a single physical location; only the references are copied as they are passed around.
Looking at the definition of uperCase(), you can see that the reference that's passed in has the name str, and it exists only for as long as the body of uperCase() is being executed.
When uperCase() completes, the local reference str vanishes. uperCase() returns the result, which is the original string with all the characters set to uppercase.
It actually returns a reference to the result. But the reference that it returns points to a new object, and the original String is left unchanged.
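The same behaviour can be checked for other String methods. The following is only a small sketch: the class name ImmutableDemo and the helper sameObject() are made up for illustration, while toUpperCase() and replace() are the standard String methods.

```java
class ImmutableDemo {
    // Hypothetical helper: true only if both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String original = "santosh";
        String upper = original.toUpperCase();        // a new String object
        String replaced = original.replace('s', 'S'); // another new String object

        System.out.println(original);                    // santosh
        System.out.println(upper);                       // SANTOSH
        System.out.println(replaced);                    // SantoSh
        System.out.println(sameObject(original, upper)); // false
    }
}
```

Whatever method is called, original keeps printing "santosh", because each call hands back a reference to a different object.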
Overloading '+' vs. StringBuilder
Since String objects are immutable, you can alias a particular String as many times as you want. Because a String is read-only, there is no possibility that one reference will change something that will affect the other references.
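Here is a minimal sketch of what aliasing looks like in practice. The class name AliasDemo and the helper addSuffix() are assumptions made for this illustration, not part of any library.

```java
class AliasDemo {
    // Hypothetical helper: "modifies" its parameter with += and returns the result.
    // The += only rebinds the local reference s to a new String object.
    static String addSuffix(String s) {
        s += " apple";
        return s;
    }

    public static void main(String[] args) {
        String first = "papaya";
        String second = first;              // second is an alias for the same object
        System.out.println(first == second); // true

        second = addSuffix(second);          // second now refers to a new String
        System.out.println(first);           // papaya
        System.out.println(second);          // papaya apple
    }
}
```

No matter what is done through second, the object that first refers to never changes.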
Immutability can have efficiency issues. A case in point is the operator '+' that has been overloaded for String objects.
Overloading means that an operation has been given an extra meaning when used with a particular class.
The '+' and '+=' operators for String are the only overloaded operators in Java, and Java does not allow the programmer to overload any others.
Note: C++ allows the programmer to overload operators at will. Because this can often be a complicated process, the Java designers decided that it shouldn't be included in Java.
The '+' operator allows you to concatenate Strings:

    package blog;

    public class Concatenation {
        public static void main(String[] args) {
            String apple = "apple ";
            // The int 40 is converted to its String form during concatenation.
            String str = "papaya " + apple + "etc " + 40;
            System.out.println(str);
        }
    }
Output
papaya apple etc 40
You could imagine how this might work. The String "papaya" could have a method append() that creates a new String object containing "papaya" concatenated with the contents of apple. The new String object would then create another new String that added "etc", and so on.
This would certainly work, but it requires the creation of a lot of String objects just to put together this new String, and then you have a bunch of intermediate String objects that need to be garbage collected. I suspect that the Java designers tried this approach first. I also suspect that they discovered it delivered unacceptable performance.
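To make the imagined approach concrete, here is a rough sketch that uses the real String.concat() method in place of the hypothetical append() described above. The class name NaiveConcat and the method build() are made up for illustration.

```java
class NaiveConcat {
    // Builds the result step by step; every concat() call creates a new String.
    static String build(String apple) {
        String step1 = "papaya ".concat(apple);          // intermediate object, e.g. "papaya apple "
        String step2 = step1.concat("etc ");             // intermediate object, e.g. "papaya apple etc "
        String step3 = step2.concat(String.valueOf(40)); // the final String
        return step3;
    }

    public static void main(String[] args) {
        System.out.println(build("apple ")); // papaya apple etc 40
    }
}
```

Here step1 and step2 exist only to be thrown away, which is exactly the garbage the paragraph above is worried about.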
To see what really happens, you can decompile the above code using the javap tool that comes as part of the JDK. Here's the command line : javap -c Concatenation
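The -c flag will produce the JVM byte codes. After we strip out the parts we are not interested in and do a bit of editing, here are the relevant byte codes. (This is the output for a Java 8 or earlier compiler. Since JDK 9, javac compiles String concatenation to an invokedynamic instruction instead, so the listing from a newer compiler will look different.)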
public class Concatenation {
public static void main(java.lang.String[]);
Code:
0: ldc #2 // String apple
2: astore_1
3: new #3 // class java/lang/StringBuilder
6: dup
7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
10: ldc #5 // String papaya
12: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: aload_1
16: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: ldc #7 // String etc
21: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: bipush 40
26: invokevirtual #8 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
29: invokevirtual #9 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
32: astore_2
33: getstatic #10 // Field java/lang/System.out:Ljava/io/PrintStream;
36: aload_2
37: invokevirtual #11 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
40: return
}
If you have experience with assembly language, this may look familiar to you: instructions like dup and invokevirtual are the Java Virtual Machine (JVM) equivalent of assembly language. If you have never seen assembly language, don't worry about it. The important part to notice is the introduction of the java.lang.StringBuilder class by the compiler. There was no mention of StringBuilder in the source code, but the compiler decided to use it anyway, because it is much more efficient.
In this case, the compiler creates a StringBuilder object to build the String str, and calls append() four times, once for each of the pieces. Finally, it calls toString() to produce the result, which it stores as str.
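Written out by hand, the byte codes above correspond roughly to the following source. This is only a sketch of the equivalent Java code, not something the compiler literally emits; the class name ConcatenationByHand and the method build() are made up for illustration.

```java
class ConcatenationByHand {
    // One StringBuilder, four append() calls, one toString(), as in the byte codes.
    static String build(String apple) {
        return new StringBuilder()
                .append("papaya ")
                .append(apple)
                .append("etc ")
                .append(40)      // the append(int) overload
                .toString();
    }

    public static void main(String[] args) {
        String apple = "apple ";
        System.out.println(build(apple)); // papaya apple etc 40
    }
}
```

The result is the same String that "papaya " + apple + "etc " + 40 produces, but only one intermediate object (the StringBuilder) is created along the way.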
Before you assume that you should just use String everywhere and that the compiler will make everything efficient, let's look a little more closely at what the compiler is doing.
Here is an example that produces a String result in two ways: using String, and by hand-coding with StringBuilder:
    package blog;

    public class WithStringBuilder {
        // Relies on the compiler: concatenates with += inside the loop.
        public String implicit(String[] fields) {
            String result = "";
            for (int i = 0; i < fields.length; i++) {
                result += fields[i];
            }
            return result;
        }

        // Hand-coded: one StringBuilder is created before the loop and reused.
        public String explicit(String[] fields) {
            StringBuilder result = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                result.append(fields[i]);
            }
            return result.toString();
        }
    }
Now if you run javap -c WithStringBuilder, you can see the simplified byte codes for the two different methods. First, implicit():
public java.lang.String implicit(java.lang.String[]);
Code:
0: ldc #2 // String
2: astore_2
3: iconst_0
4: istore_3
5: iload_3
6: aload_1
7: arraylength
8: if_icmpge 38
11: new #3 // class java/lang/StringBuilder
14: dup
15: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
18: aload_2
19: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
22: aload_1
23: iload_3
24: aaload
25: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
31: astore_2
32: iinc 3, 1
35: goto 5
38: aload_2
39: areturn
Notice 8: and 35:, which together form a loop. 8: does an "integer compare greater than or equal to" of the operands on the stack and jumps to 38: when the loop is done. 35: is a goto back to the beginning of the loop, at 5:.
The important thing to note is that the StringBuilder construction happens inside this loop, which means you are going to get a new StringBuilder object every time you pass through the loop.
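In source form, the byte codes of implicit() behave roughly like the sketch below. It is a hand translation for illustration only; the class name ImplicitByHand is an assumption.

```java
class ImplicitByHand {
    // Mirrors the byte codes of implicit(): a fresh StringBuilder on every pass.
    static String implicit(String[] fields) {
        String result = "";
        for (int i = 0; i < fields.length; i++) {
            // new StringBuilder -> append(result) -> append(fields[i]) -> toString()
            result = new StringBuilder()
                    .append(result)
                    .append(fields[i])
                    .toString();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(implicit(new String[] {"papaya ", "apple ", "etc"}));
        // papaya apple etc
    }
}
```

Each pass also copies everything accumulated so far into the new StringBuilder, so the cost per iteration grows as result gets longer.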
Here are the byte codes for explicit():
public java.lang.String explicit(java.lang.String[]);
Code:
0: new #3 // class java/lang/StringBuilder
3: dup
4: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
7: astore_2
8: iconst_0
9: istore_3
10: iload_3
11: aload_1
12: arraylength
13: if_icmpge 30
16: aload_2
17: aload_1
18: iload_3
19: aaload
20: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
23: pop
24: iinc 3, 1
27: goto 10
30: aload_2
31: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
34: areturn
Not only is the loop code shorter and simpler, the method only creates a single StringBuilder object. Creating an explicit StringBuilder also allows you to preallocate its size if you have extra information about how big it might need to be, so that it does not need to constantly reallocate the buffer.
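As a sketch of the preallocation idea, the StringBuilder(int capacity) constructor can be given the total length up front. The class name PreallocatedBuilder and the method join() are made up for this example, and it assumes that none of the array elements is null.

```java
class PreallocatedBuilder {
    // Assumes fields contains no null elements.
    static String join(String[] fields) {
        // First pass: work out how many characters the result will need.
        int total = 0;
        for (String field : fields) {
            total += field.length();
        }

        // Second pass: append into a buffer that is already large enough.
        StringBuilder result = new StringBuilder(total);
        for (String field : fields) {
            result.append(field);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"papaya ", "apple ", "etc"}));
        // papaya apple etc
    }
}
```

Whether the extra pass pays off depends on the data; for a handful of short Strings the default capacity is usually fine.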
Thus, when you create a toString() method, if the operations are simple ones that the compiler can figure out on its own, you can generally rely on the compiler to build the result in a reasonable fashion. But if looping is involved, you should explicitly use a StringBuilder in your code and then convert it to a String using toString().
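Here is a minimal sketch of such a toString() method with a loop in it. The class Basket and its items field are hypothetical and exist only for this example.

```java
class Basket {
    private final String[] items;

    Basket(String... items) {
        this.items = items.clone();
    }

    @Override
    public String toString() {
        // A single StringBuilder is created once and reused for every item.
        StringBuilder result = new StringBuilder("Basket[");
        for (int i = 0; i < items.length; i++) {
            if (i > 0) {
                result.append(", ");
            }
            result.append(items[i]);
        }
        return result.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(new Basket("papaya", "apple")); // Basket[papaya, apple]
    }
}
```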
StringBuilder was introduced in Java SE5. Prior to this, Java used StringBuffer, which ensures thread safety and so is significantly more expensive.
Stay tuned for more on Strings in Java.