A "pure" object-oriented programming language treats each datum as a complete object. This makes for a uniform way to handle data, but it also adds overhead in both time and space that can sometimes be a significant disadvantage. If all you want to do is process integers, for example, you'd rather not bother with the heap space and implicit pointer indirection that using objects brings. This explains why C++ is preferable to Smalltalk for time-critical applications. Everything in Smalltalk, even an integer, is an object. C++, on the other hand, has efficient built-in integer and floating-point types, so you only incur object overhead when you explicitly ask for it.
Java allows you to choose either data-handling paradigm. It has built-in numeric types similar to those in C++, but it also provides "wrapper" classes for integral and floating-point objects (see Table 1). You usually only use the wrapper classes when complete objects are required, such as in collections. In this article I'll talk about built-in (or "primitive") types, wrappers, and operators.
The most significant feature of Java's primitive types is that they are truly portable. Because C was
conceived primarily as a systems programming language, its data types are tailored to each host system and
therefore vary in size from platform to platform. Building portable programs in C/C++ consequently
often requires conditional compilation to select the correct numeric type. In Java, on the other hand, an
int is a 32-bit two's-complement integer wherever you go. On systems where the size of a machine word
is not 32 bits, there could be a small performance penalty, but for applications that can spare a nanosecond
here and there (i.e., most applications) no one will ever notice. This uniform data size for primitive types
removes the need to recompile code for different platforms, so the same compiled byte codes run on any
platform that has a Java Virtual Machine. That's portability.
C and C++ require a long to be at least 32 bits, and that's exactly what most compilers give
you, but Java mandates that you get 64 bits, resulting in a range over 4 billion times larger. C/C++
compilers are also at liberty to make float and double the same size, but again, Java guarantees distinct
sizes of 32 and 64 bits respectively. Because of the fixed size of primitive types, and because of the way
objects are created in Java, there is no need for a sizeof operator.
There is no unsigned qualifier in Java. Except for boolean and char, all types are signed. A boolean
only holds two values, true and false, so signed-ness doesn't apply. The char type represents a Unicode
character, which is the industry standard 16-bit encoding covering the range [0, 65535]. (For more on
Unicode, see "The Standard C Library, Part 3", CUJ, February 1995, p. 94). Having only signed numeric
types makes it a little tedious to enforce non-negative parameter values, but it also avoids the
signed/unsigned mismatch problems that occur all too frequently in C++.
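As a rough sketch of what enforcing a non-negative parameter looks like in practice, the class below validates its argument by hand. The names Checked and repeat are hypothetical, chosen only for illustration; they are not part of the Java library.

```java
// Hypothetical example: Checked and repeat are illustrative names only.
class Checked {
    // Java has no unsigned int, so a count that must not be negative
    // has to be validated explicitly.
    static String repeat(char c, int count) {
        if (count < 0)
            throw new IllegalArgumentException("count must be >= 0: " + count);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; ++i)
            sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat('*', 5));     // *****
        try {
            repeat('*', -1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In C++ you might declare count as unsigned and let the type carry the constraint; here the check is a run-time test instead.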
A key use of unsigned in C++ is to zero-fill an integer when right-shifting. For example, the result of the
expression
x >> 3
may differ among C/C++ compilers when x is signed and negative. Most implementations will propagate the
sign bit in the upper 3 slots of the result. To guarantee zero-fill instead of sign extension, you must
declare x as unsigned. Once again, Java is more precise. With the normal right-shift operator (>>) you
always get sign extension. If you want zero-fill semantics you use the >>> operator, as in
x >>> 3
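The difference between the two right-shift operators is easiest to see with a negative operand. The following is a minimal sketch; the class and method names are made up for this illustration.

```java
// Illustrative only: ShiftDemo and its helper names are not from the article.
class ShiftDemo {
    static int arithmeticShift(int x, int n) { return x >> n; }   // sign extension
    static int logicalShift(int x, int n)    { return x >>> n; }  // zero fill

    public static void main(String[] args) {
        int x = -16;                                    // bit pattern 0xFFFFFFF0
        System.out.println(arithmeticShift(x, 3));      // -2
        System.out.println(logicalShift(x, 3));         // 536870910 (0x1FFFFFFE)
        // For non-negative values the two operators agree:
        System.out.println(arithmeticShift(16, 3));     // 2
        System.out.println(logicalShift(16, 3));        // 2
    }
}
```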
The logical operators && and || short-circuit, just as in C. If i is not less than N in
the following expression, then a[i] is not even evaluated:
if (i < N && a[i] != 0)
The most interesting feature of Java operators for C/C++ programmers is what's missing. You already know
there is no sizeof operator. Since Java does not use explicit pointers you won't find the pointer
operations: unary & and *, and the -> operator. There is also no general comma operator, although you can
place comma-separated sequences of expressions in the initialization and iteration parts of a for loop as
you can in C. The only other surprises are the addition of the >>> operator mentioned above, and the
instanceof operator, discussed in the "Wrappers" section below.
The equality operator (==) requires some special care in Java. When comparing primitive types, it is
similar to C with one improvement: Java's type system catches the following common error:
if (x = y) // Oops! Meant to type "=="
...
Java expects the target of a logical expression to be of type boolean, but the type of an assignment is the
type of its (possibly promoted) left operand, so unless x is a boolean, the compiler will flag the typo
above as an error.
But when it comes to objects, == is rarely what you want. For example, if x and y
are instances of
class Foo, then the expression x == y compares the objects' handles, not the values of the objects'
fields. I'll reveal the secrets of this mystery in a future column. (If you know Lisp, the situation is
analogous to eq vs. equal in that language). For now, just remember not to use == to compare objects,
but to use the equals method instead.
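To make the distinction concrete, here is a small sketch that uses String objects in place of the Foo class mentioned above (String is used only because it already defines a value-based equals method). The helper names are invented for the example.

```java
// Illustrative sketch: EqualsDemo and its helpers are hypothetical names.
class EqualsDemo {
    static boolean sameObject(Object a, Object b) { return a == b; }       // compares handles
    static boolean sameValue(Object a, Object b)  { return a.equals(b); }  // compares contents

    public static void main(String[] args) {
        String x = new String("hello");
        String y = new String("hello");   // a distinct object with equal contents
        String z = x;                     // a second handle to the same object

        System.out.println(sameObject(x, y));   // false
        System.out.println(sameValue(x, y));    // true
        System.out.println(sameObject(x, z));   // true
    }
}
```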
A final word on operators that C programmers will find interesting concerns how Java evaluates operands.
Java always fully evaluates the operands of a binary operator left-to-right. Guaranteed. For example, in the
expression f() + g() you can count on any side effects of f() being complete before the call to
g(). C, on the other hand, makes no guarantees whatsoever on the order of evaluation of operands, which
is why the C Standards committee had to define sequence points to give programmers some control over
side effects. You don't need to worry about sequence points in Java.
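One way to observe the left-to-right guarantee is to give f() and g() a visible side effect. The sketch below records each call in a log; the class name and the trace helper are assumptions made for the illustration.

```java
// Illustrative sketch: EvalOrder, log, and trace are hypothetical names.
class EvalOrder {
    static StringBuilder log = new StringBuilder();

    static int f() { log.append('f'); return 1; }
    static int g() { log.append('g'); return 2; }

    // Evaluates f() + g() and reports the order in which the calls ran.
    static String trace() {
        log.setLength(0);
        int sum = f() + g();
        return log + " -> " + sum;
    }

    public static void main(String[] args) {
        System.out.println(trace());   // fg -> 3
    }
}
```

In C the equivalent program could legally log either "fg" or "gf".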
public class Literals {
    public static void main(String[] args) {
        boolean b = true;
        System.out.println(b);
        int i1 = 15, i2 = 017, i3 = 0x0f;
        System.out.println(i1 + "," + i2 + "," + i3);
        long n = 1234567L;
        System.out.println(n);
        float x1 = 123.4567F, x2 = .1234567e3f;
        System.out.println(x1 + "," + x2);
        double y1 = 2.3d, y2 = .23e1;
        System.out.println(y1 + "," + y2);
        char c1 = 'a', c2 = '\u0061', c3 = '\141';
        System.out.println(c1 + "," + c2 + "," + c3);
        String s = "hello";
        System.out.println(s);
    }
}
/* Output:
true
15,15,15
1234567
123.4567,123.4567
2.3,2.3
a,a,a
hello
*/
As you can see, literals are similar to those in C++. The boolean literals are true and false.
Unadorned numeric literals are of type double if they have a decimal point or an exponent, and int if they
don't. As in C, a leading 0 denotes an octal int, and a 0x prefix introduces a hexadecimal number. An l
suffix makes an integer literal a long, and an f suffix makes a floating-point literal a float. Any numeric
expression can initialize a double; if you want to be explicit, you can use a d suffix. In all cases the
letter you use to identify the type of a numeric literal can be either upper or lower case, but a lower
case 'l' is discouraged since you can too easily mistake it for the numeral 1.
There are no suffixes for short and byte. You either assign an int literal in the correct range, or
you cast to the appropriate type as needed (see "Conversions and Casts" below). Character literals occur
between single quotes, as in C, except that you can also specify a Unicode character escape sequence with a
lower case u, as in '\u001c', just like in C++. Unicode escape sequences are always interpreted as
hexadecimal. Java does not support 32-bit ISO 10646 characters like C++ does (e.g., '\U0000001c').
Java supports most of the character escape sequences that C does, such as '\n', '\t', etc., except for
'\a' (audible bell) and '\v' (vertical tab).
The Java equivalent of const as far as variables are concerned is the final keyword, which specifies
that a variable cannot be changed (i.e., it has its "final" value). The following declares a constant int:
final int max = 32767;
Local final variables should always be initialized in their declaration. In future installments you'll see
alternative initialization techniques for class data members.
As in C, "widening" conversions happen implicitly on assignment, for example when you assign a
float to a double or an int to a long. Assigning the other way usually requires a cast, as in
// A "narrowing" conversion
int i = 2;
byte b = (byte) i; // note C-style cast
Java's cast syntax is identical to C (viz., the target type precedes the operand in parentheses). If a literal represents a value small enough to fit into the range of the target variable then a cast is not required, for example:
// 127 is an int literal; 128 would fail
byte b = 127;
If you substituted 128 for 127 above the compiler would complain, since 128 is outside the range of a byte. Narrowing conversions often result in a loss of information, including sign, since you lose bits off the top. For example, forcing the issue with a cast, as in byte b = (byte) 128, would initialize b to -128. Starting with the 32-bit representation of 128 (0...010000000, i.e., 24 zeroes, a 1, then 7 zeroes), the upper 24 zeroes are dropped, and the remaining 8 bits (10000000) are interpreted as a signed integer.
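The bit-dropping behavior of narrowing casts can be checked directly. The following sketch is only an illustration; the class name and the toByte helper are invented for it.

```java
// Illustrative sketch: Narrowing and toByte are hypothetical names.
class Narrowing {
    // Keeps only the low 8 bits of i and reinterprets them as a signed byte.
    static byte toByte(int i) { return (byte) i; }

    public static void main(String[] args) {
        System.out.println(toByte(127));        // 127  (fits, no change)
        System.out.println(toByte(128));        // -128 (bit pattern 10000000)
        System.out.println(toByte(200));        // -56  (200 - 256)
        System.out.println(toByte(256));        // 0    (low 8 bits are all zero)
        System.out.println((short) 65535);      // -1   (low 16 bits are all one)
        System.out.println((int) 3.99);         // 3    (fraction is truncated)
    }
}
```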
Like standard numeric conversions in C, widening conversions occur implicitly for primitive types when you use them in an expression or as parameters to a function. Binary numeric and comparison operations, for example, follow this simple logic:
if either operand is a double then
    convert the other to double if needed
else if either operand is float then
    convert the other to float if needed
else if either operand is long then
    convert the other to long if needed
else
    convert both to int as needed
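One way to watch these rules at work is to let overload resolution report the static type of an expression. This is only a sketch: the overloaded typeOf methods are hypothetical helpers written for the demonstration, not library functions.

```java
// Illustrative sketch: Promotion and typeOf are hypothetical names.
class Promotion {
    // The compiler picks the overload that matches the type of the argument
    // expression, which reveals the result of binary numeric promotion.
    static String typeOf(int x)    { return "int"; }
    static String typeOf(long x)   { return "long"; }
    static String typeOf(float x)  { return "float"; }
    static String typeOf(double x) { return "double"; }

    public static void main(String[] args) {
        byte b = 1; short s = 2; int i = 3; long n = 4L;
        float f = 5.0f; double d = 6.0;

        System.out.println(typeOf(b + s));   // int
        System.out.println(typeOf(b + b));   // int
        System.out.println(typeOf(i + n));   // long
        System.out.println(typeOf(n + f));   // float
        System.out.println(typeOf(f + d));   // double
    }
}
```

Note that even byte + byte yields an int, which is why assigning such a sum back to a byte requires a cast.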
Passing a byte as an argument to a function expecting an int likewise causes an implicit conversion of
the byte to an int. You don't, however, get implicit conversions from a primitive type to a class object
like you do with single argument constructors in C++. For example, if you have a class Foo with a
constructor that takes a single int, and a function f that takes a single Foo argument, you can't call
f(1), nor even f((Foo)1). Why? Because objects must always be created via the new operator, so the
correct form is f(new Foo(1)). The key motivation for implicit conversions via single-arg constructors
in C++ was to complement operator overloading, which doesn't exist in Java. One less thing to worry
about.
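Spelled out as code, the situation looks something like the sketch below. Foo is the hypothetical class from the discussion above, with a constructor taking a single int, and f is a function taking a Foo; the getValue method and the NoImplicit class are added here only so the example has something to print.

```java
// Hypothetical class from the discussion; getValue is added for illustration.
class Foo {
    private final int value;
    Foo(int value) { this.value = value; }
    int getValue() { return value; }
}

class NoImplicit {
    static int f(Foo foo) { return foo.getValue(); }

    public static void main(String[] args) {
        // f(1);          // would not compile: an int is not a Foo
        // f((Foo) 1);    // would not compile: no cast from int to Foo
        System.out.println(f(new Foo(1)));   // 1
    }
}
```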
public class Limits {
    public static void main(String[] args) {
        System.out.println("Byte: [" +
            Byte.MIN_VALUE + "," +
            Byte.MAX_VALUE + "]");
        // Cast the char limits to int so they print as numbers
        // rather than as characters.
        System.out.println("Character: [" +
            (int) Character.MIN_VALUE + "," +
            (int) Character.MAX_VALUE + "]");
        System.out.println("Short: [" +
            Short.MIN_VALUE + "," +
            Short.MAX_VALUE + "]");
        System.out.println("Integer: [" +
            Integer.MIN_VALUE + "," +
            Integer.MAX_VALUE + "]");
        System.out.println("Long: [" +
            Long.MIN_VALUE + "," +
            Long.MAX_VALUE + "]");
        // For Float and Double, MIN_VALUE is the smallest positive
        // value, not the most negative one.
        System.out.println("Float: [" +
            Float.MIN_VALUE + "," +
            Float.MAX_VALUE + "]");
        System.out.println("Double: [" +
            Double.MIN_VALUE + "," +
            Double.MAX_VALUE + "]");
    }
}
/* Output:
Byte: [-128,127]
Character: [0,65535]
Short: [-32768,32767]
Integer: [-2147483648,2147483647]
Long: [-9223372036854775808,9223372036854775807]
Float: [1.4E-45,3.4028235E38]
Double: [4.9E-324,1.7976931348623157E308]
*/
Many classes in the Java library work with generic objects, or in other
words, with instances of the Object class. A class that does not
explicitly extend another class implicitly extends Object, so all
classes inherit from Object one way or the other. A collection class,
such as Vector, can act as a generic container in that it holds objects
of type Object, and can therefore hold any Java object. But primitive
types are not objects, so a Vector cannot hold integers or any other
numeric type directly. The work-around is to populate the Vector with
Integer objects, the wrapper for int. The following program uses this
technique to store 10 integers in a Vector.
import java.util.*; // Import the Vector class
public class UseVector {
    public static void main(String[] args) {
        Vector v = new Vector();
        for (int i = 0; i < 10; ++i)
            v.addElement(new Integer(i));
        for (int i = 0; i < v.size(); ++i)
            System.out.print(v.elementAt(i) + " ");
    }
}
/* Output:
0 1 2 3 4 5 6 7 8 9
*/
The wrapper classes have a number of useful methods. Each integer-related type has an atoi-like
equivalent for converting a string representation of a number to a number. For example, Integer has
parseInt, Long has parseLong, and so on. Each wrapper type also has functions that return its value
in all numeric formats, e.g., byteValue, longValue, doubleValue, etc. The following program
converts strings to int and float.
public class ParseNums {
    public static void main(String[] args) {
        int i = Integer.parseInt("123");
        int j = Integer.parseInt("4f", 16);
        float x = Float.valueOf("123.45").floatValue();
        System.out.println("i = " + i + "," +
                           "j = " + j + "," +
                           "x = " + x);
        System.out.println("i = " + Integer.toBinaryString(i));
    }
}
/* Output:
i = 123,j = 79,x = 123.45
i = 1111011
*/
In JDK 1.1 there is no parse function for the floating-point wrapper types (Java 2 adds
Float.parseFloat and Double.parseDouble). All wrappers except Character have a valueOf method that
parses a string but returns a wrapper object, not a primitive, so I used that for the Float example above.
The six numeric wrapper classes all inherit from the Number abstract class, which defines the methods
byteValue(), intValue(), etc. This allows you to define classes that can process any numeric type,
simply by writing to the interface of the Number class.
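As a sketch of what "writing to the interface of the Number class" can look like, the method below sums an array that mixes several wrapper types. The class and method names are hypothetical, and the static valueOf factories used to build the wrappers come from later Java releases than the JDK 1.1 discussed in this article.

```java
// Illustrative sketch: NumberSum and sum are hypothetical names.
class NumberSum {
    // Works for any mix of Byte, Short, Integer, Long, Float, and Double,
    // because it relies only on the Number interface.
    static double sum(Number[] values) {
        double total = 0.0;
        for (int i = 0; i < values.length; ++i)
            total += values[i].doubleValue();
        return total;
    }

    public static void main(String[] args) {
        Number[] values = {
            Integer.valueOf(1),
            Long.valueOf(2L),
            Double.valueOf(0.5),
            Byte.valueOf((byte) 3)
        };
        System.out.println(sum(values));   // 6.5
    }
}
```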
If you ever want to verify that an object is an instance of a particular type you can use the instanceof
operator. For example, if you have a function that takes a single Object parameter, f(Object n), say,
you can verify that the actual argument is of a type derived from Number, as follows:
void f(Object n) {
    if (n instanceof Number) {
        // go ahead (n is a Byte, Short, Integer, etc.)
    } else {
        // not a Number, do something else
    }
}
The instanceof operator returns true if its left operand is an instance of its right operand or of any
class that inherits from its right operand.
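The inheritance rule is easy to confirm with a few wrapper objects. In this minimal sketch the class and method names are invented for the illustration; the last call shows that a null reference is never an instance of anything.

```java
// Illustrative sketch: InstanceOfDemo and describe are hypothetical names.
class InstanceOfDemo {
    static String describe(Object o) {
        if (o instanceof Integer)
            return "Integer";
        if (o instanceof Number)          // true for any class derived from Number
            return "some other Number";
        return "not a Number";
    }

    public static void main(String[] args) {
        System.out.println(describe(Integer.valueOf(1)));    // Integer
        System.out.println(describe(Double.valueOf(2.5)));   // some other Number
        System.out.println(describe("3"));                   // not a Number
        System.out.println(describe(null));                  // not a Number
    }
}
```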
The Character wrapper contains a number of methods for classifying characters, similar to the
functionality found in the C header <ctype.h>, such as isDigit, isISOControl, isLetter,
isLetterOrDigit, isSpaceChar, isUpperCase, toUpperCase, etc. The methods
isJavaIdentifierStart and isJavaIdentifierPart identify a character as a valid part of an
identifier. Java identifiers can begin with a dollar sign, an underscore, or a valid "letter" from any Unicode
script. The following are valid Java identifiers:
A$very_long$identifier preço كنَ
In the beginning of this article I mentioned that the wrappers incur a performance hit compared to using primitive types. To demonstrate that point, the program in Listing 1 creates an identity vector in an array of 250,000 ints, computes its sum, and then displays the elapsed time using Date objects. Listing 2 has a program that does the same thing using Integer objects. When I run these programs on my 400 MHz Pentium II, the primitive version takes 50 milliseconds while the object version takes 1,710 ms, which is slower by a factor of 34. (I'm using JDK 1.1.7A.) So use primitive types whenever you can!
import java.math.*; // For BigInteger
public class BigInt {
    public static void main(String[] args) {
        // Build a number with 40 digits:
        String s = "12345678901234567890"
                 + "12345678901234567890";
        BigInteger b = new BigInteger(s);
        BigInteger one = BigInteger.valueOf(1);
        BigInteger two = BigInteger.valueOf(2);
        BigInteger b2 = b.add(one);
        System.out.println("b has " + b.bitCount() + " bits");
        System.out.println("b mod 2 = " + b.mod(two));
        System.out.println("b2 mod 2 = " + b2.mod(two));
        System.out.println("b2 - b = " + b2.subtract(b));
    }
}
/* Output:
b has 68 bits // 2's-complement
b mod 2 = 0 // even
b2 mod 2 = 1 // odd
b2 - b = 1
*/
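The bit count reported here is derived from a two's-complement representation of the number: bitCount
returns the number of bits that differ from the sign bit (the one bits, for a positive number), while
the bitLength method returns the length of the minimal representation, excluding the sign bit.
BigInteger also has methods for shifting and primality testing.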
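The shifting and primality-testing methods can be tried out as in the sketch below. The class name and the powerOfTwo helper are assumptions for the illustration; shiftLeft, shiftRight, bitLength, bitCount, and isProbablePrime are BigInteger methods. Keep in mind that isProbablePrime is a probabilistic test: a true result means "prime with probability better than 1 - 1/2^certainty".

```java
import java.math.BigInteger;

// Illustrative sketch: BigIntOps and powerOfTwo are hypothetical names.
class BigIntOps {
    // Computes 2^n by shifting 1 left by n bit positions.
    static BigInteger powerOfTwo(int n) {
        return BigInteger.valueOf(1).shiftLeft(n);
    }

    public static void main(String[] args) {
        BigInteger big = powerOfTwo(100);
        System.out.println(big);                  // 1267650600228229401496703205376
        System.out.println(big.bitLength());      // 101
        System.out.println(big.bitCount());       // 1
        System.out.println(big.shiftRight(98));   // 4

        System.out.println(BigInteger.valueOf(97).isProbablePrime(50));   // true
        System.out.println(BigInteger.valueOf(91).isProbablePrime(50));   // false (7 * 13)
    }
}
```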
Java's primitive types and operators will look familiar to C and C++ programmers, apart from the new
>>> and instanceof operators. Wrappers provide the features that <limits.h> and
<float.h> do in C, and then some. In typical Java style, the functionality you need is where you expect
it to be: in appropriately named classes. You may sometimes end up doing a little more typing than you do
in C, what with class name prefixes and all, but you probably will do less hunting for the right identifiers.
Table 1 - Java's primitive types and their wrapper classes
| Type | Size (in bits) | Wrapper |
|---|---|---|
| boolean | N/A (values: true, false) | Boolean |
| char | 16 (Unicode encoding) | Character |
| byte | 8 | Byte |
| short | 16 | Short |
| int | 32 | Integer |
| long | 64 | Long |
| float | 32 | Float |
| double | 64 | Double |
| Operator Category | Operator List |
|---|---|
| Postfix | [] . (params) postfix: ++ -- |
| Unary | + - ~ ! prefix: ++ -- |
| Creation/Conversion | new (type)x |
| Multiplicative | * / % |
| Additive | + - |
| Shift | << >> >>> |
| Relational | < > <= >= instanceof |
| Equality | == != |
| Bitwise AND | & |
| Bitwise XOR | ^ |
| Bitwise OR | \| |
| Logical AND | && |
| Logical OR | \|\| |
| Conditional | ?: |
| Assignment | = += -= *= /= %= <<= >>= >>>= &= ^= \|= |
// Listing 1 - Processing an Array of Primitive Types
import java.util.*;
public class Primitives {
    public static void main(String[] args) {
        Date start = new Date();
        final int N = 250000;
        int[] a = new int[N];
        for (int i = 0; i < N; ++i)
            a[i] = i;
        // The sum exceeds the range of an int and wraps around;
        // only the elapsed time matters here.
        int sum = a[0];
        for (int i = 1; i < N; ++i)
            sum += a[i];
        Date stop = new Date();
        System.out.println(stop.getTime()
            - start.getTime()); // 50
    }
}
// Listing 2 - Processing an Array of Wrapper Objects
import java.util.*;
public class Wrappers {
    public static void main(String[] args) {
        Date start = new Date();
        final int N = 250000;
        Integer[] a = new Integer[N];
        for (int i = 0; i < N; ++i)
            a[i] = new Integer(i);
        // The sum exceeds the range of an int and wraps around;
        // only the elapsed time matters here.
        int sum = a[0].intValue();
        for (int i = 1; i < N; ++i)
            sum += a[i].intValue();
        Date stop = new Date();
        System.out.println(stop.getTime()
            - start.getTime()); // 1710
    }
}