### The problem

This weekend I was doing some game programming and got stuck on the problem of finding out what in the game world the player targets with the mouse. This is called picking (or object selection) and is not as straightforward as one might first think.

Almost all of the examples I could find online were using old-school OpenGL (fixed-function pipeline and no shaders) and/or gluUnProject, which can be hard or impossible to use with modern OpenGL. So now that I finally got it working, I thought I would share the complete solution and hopefully help someone else!

The code below is Java using LWJGL classes, but the concept should translate to any language. **(Complete code at the bottom of the post!)**

### Step 1: Normalized screen coordinates

When the player clicks the screen, you get the X and Y of the click in screen coordinates. These must first be transformed to normalized OpenGL viewport coordinates.

- Screen: From [0,0] to, for example, [800,600] (which corner [0,0] refers to is platform dependent)
- Viewport/normalized: From [-1,-1] to [1,1] with [0,0] in the center of the screen

This can be done by the following code:

private Vector2f normalizeScreenCoords(float x, float y) {
    return new Vector2f(2.0f*x / resX - 1, 2.0f*y / resY - 1);
}
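
The method above maps y upwards, which matches OpenGL's normalized coordinates when the mouse origin is in the bottom-left corner. If your windowing library reports [0,0] in the top-left corner instead, the y axis has to be flipped. Here is a minimal sketch in plain Java (no LWJGL types); the `ScreenNormalizer` class and the `yDown` flag are hypothetical names for illustration and not part of my game code:

```java
/** Hypothetical helper (not from the original code): maps pixel coordinates to [-1, 1]. */
final class ScreenNormalizer {
    private ScreenNormalizer() {}

    /**
     * @param x     mouse x in pixels
     * @param y     mouse y in pixels
     * @param resX  viewport width in pixels
     * @param resY  viewport height in pixels
     * @param yDown true if the windowing system puts [0,0] in the top-left corner
     * @return {ndcX, ndcY}, each in the range [-1, 1]
     */
    static float[] normalize(float x, float y, float resX, float resY, boolean yDown) {
        float ndcX = 2.0f * x / resX - 1.0f;
        float ndcY = 2.0f * y / resY - 1.0f;
        // OpenGL's normalized y axis points up, so flip when the source axis points down.
        if (yDown) {
            ndcY = -ndcY;
        }
        return new float[] { ndcX, ndcY };
    }
}
```

With an 800x600 viewport, a click in the middle of the screen maps to [0,0] either way, while a click at pixel [0,0] maps to [-1,-1] for a bottom-left origin and to [-1,1] for a top-left origin.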

### Step 2: Construct points on the camera's near and far planes

Next we need to construct two points that will later be transformed into world space. The normalized x and y coordinates, combined with the z values -1.0 and 1.0, give the points of interest on the near and far planes of the camera. Since we’re dealing with points (not directions), the w component is set to 1.

Vector4f screenNear = new Vector4f(normalized.x, normalized.y, -1.0f, 1.0f);
Vector4f screenFar = new Vector4f(normalized.x, normalized.y, 1.0f, 1.0f);

### Step 3: Transform to world space

Now we transform the two points created in Step 2 to world space coordinates. First we get the inverse of the view-projection matrix. How to get a hold of the projection and view matrix depends on each individual application, but I had these stored in my camera class so that is where I put the below method:

private Matrix4f getScreenToWorldMatrix() {
    Matrix4f screenToWorld = Matrix4f.mul(projectionMatrix, viewMatrixInv, null);
    return (Matrix4f) screenToWorld.invert();
}

Next we do the transformation with the screenToWorld matrix and perform perspective correction by dividing xyz by the w component.

Vector4f near = Matrix4f.transform(screenToWorld, screenNear, null);
Vector4f far = Matrix4f.transform(screenToWorld, screenFar, null);
Vector3f nearNorm = new Vector3f(near.x/near.w, near.y/near.w, near.z/near.w);
Vector3f farNorm = new Vector3f(far.x/far.w, far.y/far.w, far.z/far.w);
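
To see why the division by w matters, it helps to work through a case where the inverse can be written down by hand. The sketch below is plain Java and rests on assumptions that are not from my game code: a symmetric, gluPerspective-style projection matrix and an identity view matrix (so eye space equals world space). The `UnprojectExample` class and its parameters are hypothetical.

```java
/** Hypothetical illustration: closed-form unprojection for a symmetric perspective matrix. */
final class UnprojectExample {
    private UnprojectExample() {}

    /**
     * Maps a normalized device coordinate back to eye space, assuming a
     * gluPerspective-style projection. With an identity view matrix this is world space.
     *
     * @param fovYDegrees vertical field of view in degrees
     * @param aspect      viewport width divided by height
     * @return {x, y, z} after dividing by w
     */
    static float[] unproject(float ndcX, float ndcY, float ndcZ,
                             float fovYDegrees, float aspect, float zNear, float zFar) {
        float f = (float) (1.0 / Math.tan(Math.toRadians(fovYDegrees) / 2.0));
        // Inverse projection matrix applied to the point (ndcX, ndcY, ndcZ, 1):
        float x = ndcX * aspect / f;
        float y = ndcY / f;
        float z = -1.0f;
        float w = (ndcZ * (zNear - zFar) + (zFar + zNear)) / (2.0f * zFar * zNear);
        // w equals 1/zNear on the near plane and 1/zFar on the far plane, so it is
        // generally not 1 and the result must be divided by it.
        return new float[] { x / w, y / w, z / w };
    }
}
```

With zNear = 1 and zFar = 100, the NDC depth -1 ends up at z = -1 and the NDC depth 1 at z = -100 (the camera looks down the negative z axis), which is exactly the near and far plane.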

### Step 4: Construct vector that goes into the scene

Now we have two points in world coordinates, at different depths (z), that both correspond to the x and y of the mouse click. From these points we can create a “ray” that goes from the camera into the scene. For a 3D application you then need to find the first object that this ray intersects.

Vector3f dir = Vector3f.sub(farNorm, nearNorm, null);
dir.normalise();
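
I did not need the full 3D case in my game, but finding the first object along the ray could be sketched roughly as follows, assuming every object is approximated by a bounding sphere. This is plain Java with float arrays instead of LWJGL vectors, and the `RayPicker` class and its methods are hypothetical names, not something from my code:

```java
/** Hypothetical sketch: find the nearest bounding sphere hit by a picking ray. */
final class RayPicker {
    private RayPicker() {}

    /**
     * @param origin ray origin {x, y, z} (for example the near-plane point)
     * @param dir    normalized ray direction {x, y, z}
     * @param center sphere center {x, y, z}
     * @param radius sphere radius
     * @return distance along the ray to the first hit, or -1 if the ray misses
     */
    static float intersectSphere(float[] origin, float[] dir, float[] center, float radius) {
        float ox = origin[0] - center[0];
        float oy = origin[1] - center[1];
        float oz = origin[2] - center[2];
        // Solve |o + t*dir|^2 = r^2 for t. With |dir| = 1 this is t^2 + 2bt + c = 0.
        float b = ox * dir[0] + oy * dir[1] + oz * dir[2];
        float c = ox * ox + oy * oy + oz * oz - radius * radius;
        float disc = b * b - c;
        if (disc < 0) {
            return -1f; // the line through the ray never touches the sphere
        }
        float root = (float) Math.sqrt(disc);
        float t = -b - root;       // nearer of the two solutions
        if (t < 0) {
            t = -b + root;         // ray origin is inside the sphere
        }
        return t >= 0 ? t : -1f;   // negative means the sphere is behind the origin
    }

    /** Returns the index of the closest sphere hit by the ray, or -1 if none is hit. */
    static int pick(float[] origin, float[] dir, float[][] centers, float[] radii) {
        int best = -1;
        float bestT = Float.MAX_VALUE;
        for (int i = 0; i < centers.length; i++) {
            float t = intersectSphere(origin, dir, centers[i], radii[i]);
            if (t >= 0 && t < bestT) {
                bestT = t;
                best = i;
            }
        }
        return best;
    }
}
```

Testing every object like this is linear in the number of objects, which is probably fine for small scenes; larger scenes would likely want some spatial structure in front of it.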

### Step 5: Find out where the ray intersects the z = 0 plane (normal (0,0,1))

In my game (which is more or less a 2D game) all objects are placed on the plane where z = 0.

Therefore I want the x and y coordinates where the ray intersects this plane. This is described on 510-978-2836 (I use p0 = (0,0,0) and l0 = the near-plane point)

Vector3f normal = new Vector3f(0,0,1);
float d = -Vector3f.dot(nearNorm, normal) / Vector3f.dot(dir, normal);
Vector3f lScaled = (Vector3f)dir.scale(d);
Vector3f worldCoords = Vector3f.add(lScaled, nearNorm, null);
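
One case the snippet above does not handle is a ray that runs parallel to the plane: the dot product in the denominator becomes zero and the division produces infinity or NaN. A more general version with a guard could look roughly like this. It is a sketch in plain Java with float arrays; the `RayPlane` class, the epsilon value, and returning `null` for the parallel case are my assumptions for illustration:

```java
/** Hypothetical sketch: ray/plane intersection with a guard for parallel rays. */
final class RayPlane {
    private RayPlane() {}

    static float dot(float[] a, float[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    /**
     * @param l0 a point on the ray (for example the near-plane point)
     * @param l  ray direction
     * @param p0 a point on the plane
     * @param n  plane normal
     * @return intersection point {x, y, z}, or null if the ray is parallel to the plane
     */
    static float[] intersect(float[] l0, float[] l, float[] p0, float[] n) {
        float denom = dot(l, n);
        if (Math.abs(denom) < 1e-6f) {
            return null; // parallel: there is no single intersection point
        }
        float[] diff = { p0[0] - l0[0], p0[1] - l0[1], p0[2] - l0[2] };
        // d = (p0 - l0) . n / (l . n); with p0 = (0,0,0) this reduces to -(l0 . n) / (l . n)
        float d = dot(diff, n) / denom;
        return new float[] { l0[0] + d * l[0], l0[1] + d * l[1], l0[2] + d * l[2] };
    }
}
```

For example, a ray starting at (1, 2, 5) and pointing straight down the negative z axis hits the z = 0 plane at (1, 2, 0).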

That's it!

### Complete code

// LWJGL 2 utility classes used below
import org.lwjgl.util.vector.Matrix4f;
import org.lwjgl.util.vector.Vector2f;
import org.lwjgl.util.vector.Vector3f;
import org.lwjgl.util.vector.Vector4f;

// The methods below live in my camera class, which holds resX, resY,
// projectionMatrix and viewMatrixInv as fields.
public Vector3f screenToWorld(float screenX, float screenY) {
    // Get OpenGL screen coordinates (-1.0 to 1.0)
    Vector2f normalized = normalizeScreenCoords(screenX, screenY);

    // Set up two points that will correspond to the near and far plane of the
    // camera when transformed by the screen to world matrix.
    Vector4f screenNear = new Vector4f(normalized.x, normalized.y, -1.0f, 1.0f);
    Vector4f screenFar = new Vector4f(normalized.x, normalized.y, 1.0f, 1.0f);
    Matrix4f screenToWorld = getScreenToWorldMatrix();
    Vector4f near = Matrix4f.transform(screenToWorld, screenNear, null);
    Vector4f far = Matrix4f.transform(screenToWorld, screenFar, null);

    // Normalize vectors with respect to the w component (perspective correction)
    Vector3f nearNorm = new Vector3f(near.x/near.w, near.y/near.w, near.z/near.w);
    Vector3f farNorm = new Vector3f(far.x/far.w, far.y/far.w, far.z/far.w);

    // Calculate the direction of the vector going through both points.
    // This is the ray from the point on the screen "into" the scene.
    Vector3f dir = Vector3f.sub(farNorm, nearNorm, null);
    dir.normalise();

    // Figure out where the ray intersects the z = 0 plane (normal (0, 0, 1))
    Vector3f normal = new Vector3f(0,0,1);
    float d = -Vector3f.dot(nearNorm, normal) / Vector3f.dot(dir, normal);
    Vector3f ls = (Vector3f)dir.scale(d);
    return Vector3f.add(ls, nearNorm, null);
}

private Vector2f normalizeScreenCoords(float x, float y) {
    return new Vector2f(2.0f*x / resX - 1, 2.0f*y / resY - 1);
}

private Matrix4f getScreenToWorldMatrix() {
    Matrix4f screenToWorld = Matrix4f.mul(projectionMatrix, viewMatrixInv, null);
    return (Matrix4f) screenToWorld.invert();
}