The general form of an operator with respect to a basis is the analog of a matrix with respect to a specific coordinate system. Consider a rotation: you must specify the basis before you can write down the rotation matrix.

 

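When going to an infinite-dimensional vector space, the sum over basis vectors becomes an integral over the continuous basis label:

$$\langle x|\hat{O}|\psi\rangle = \int dx'\,\langle x|\hat{O}|x'\rangle\,\langle x'|\psi\rangle$$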

 

Now we specifically ask what the representation of the momentum operator is with respect to the position basis. The result:

$$\langle x|\hat{p}|x'\rangle = -i\hbar\,\frac{\partial}{\partial x}\,\delta(x - x')$$

 

Aside: the order of the delta function's argument, x-x’ or x’-x, is irrelevant because the delta function is even.

 

 

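Using this in the above equation gives

$$\langle x|\hat{p}|\psi\rangle = \int dx'\,\left[-i\hbar\,\frac{\partial}{\partial x}\,\delta(x - x')\right]\langle x'|\psi\rangle = -i\hbar\,\frac{d\psi(x)}{dx}$$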

 

 

Aside: Using the theory of functions, note that you can operate to the left or to the right with this representation by integrating by parts. Operating to the left introduces a negative sign.

 

The representation of the momentum operator as a derivative can be confusing. Let us examine the relationship in more detail to see if this is indeed reasonable.

 

The first step is to imagine the form of the translation operator. Here we simply consider 4 components out of an n-dimensional space and examine how the components x_i are related to the translated components y_i.

 

Suppose one steps forward by one step. Choose the step to be x2-x1, so that the old component along basis vector e1 is now the component along basis vector e2.
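The one-step shift can be sketched as a small matrix acting on a column of components. This is only an illustration: the helper names `shift_matrix` and `apply` and the numerical values are assumptions, not part of the notes, and the first output component is zero here only because the 4 components are cut out of a larger space.

```python
# Hypothetical 4-component illustration of a one-step shift written as a matrix.

def shift_matrix(n):
    """Return the n x n matrix S with S[i][j] = 1 when i == j + 1, else 0."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]

x = [10, 20, 30, 40]   # components x_1 .. x_4 (illustrative values)
S = shift_matrix(4)
y = apply(S, x)        # y_2 = x_1, y_3 = x_2, y_4 = x_3
print(y)
```

Each row of the matrix has a single nonzero entry, so each new component receives a contribution from exactly one old component, the one sitting one step behind it.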

 

This is a shift of the function over one step. Similarly, if you go to the continuum and consider a step ε

 

 

The only contribution is from the state exactly a distance ε away.

 

Now for small ε

      (don’t worry about i or the sign)

 

Here we want to express the translation in terms of some property of the current state times the size of the step. This separates the operator and its properties from the amount of the translation. Operators with this characteristic are referred to as generators. They determine the amount of change per translational step.
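A minimal numerical sketch of this separation, assuming the convention that moving a function a distance ε to the right gives f(x − ε), and with the factors of i and ħ dropped. The function names are illustrative only. The first-order form is the identity term plus the step times the derivative acting on the function, and its error should shrink roughly as ε squared.

```python
import math

def translate_exact(f, x, eps):
    """Value at x of the function f after moving it a distance eps to the right."""
    return f(x - eps)

def translate_first_order(f, dfdx, x, eps):
    """First-order estimate: identity term plus (step) times (derivative of f)."""
    return f(x) - eps * dfdx(x)

x0 = 0.3
errors = []
for eps in (0.1, 0.01, 0.001):
    exact = translate_exact(math.sin, x0, eps)
    approx = translate_first_order(math.sin, math.cos, x0, eps)
    errors.append(abs(exact - approx))
    print(eps, exact, approx, errors[-1])
```

The derivative is evaluated once, independently of ε; only the multiplication by ε carries the amount of translation.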

 

In general, one can write an operator in a format similar to the matrix formulation used for finite dimensional vector spaces.

 

The unit vectors are not always written out explicitly in the matrix representation of an operator. Notice that the matrix formalism requires determining all the matrix elements (written here as $O_{ij}$) and using them to find the ith new component by adding up the contributions from each part of the original vector. The particular contribution of the jth component to the ith is $O_{ij}x_j$. Detailing which parts of the old and new vector are included is the same as projecting out the contributions with an inner product. Thus the matrix representation could also be written as

$$O_{ij} = \langle e_i|\hat{O}|e_j\rangle$$

 

 

Now we are looking for a way to separate the amount from some generic features of the operator.  From the intuitive standpoint a rotation involves a certain type of operation performed over a given angle.  Rotation can be separated from the angle.  One knows what a rotation is without specifying how much rotation is involved.  Can one mathematically separate operations in the same fashion?  For operations that are continuously connected to the identity one can try to use a Taylor series expansion. For the case of translation, the momentum operator carries the essence of the translation and the parameter ε provides the amount.  For very small steps the translation becomes

For large steps the translation takes the form

          (Factors of i and ħ have been dropped.)

 

If you want to write a translation in terms of some generic result times the step, then the operator cannot depend on the step size. To first order, you can imagine that the slope at a point times the step ε gives the total change at that point. The first term, 1, just keeps the amplitude for each state the same. The second term finds the change per step and multiplies it by the step size. Examine the effect on the value y2 based on the original vector x. We need to find the change in the amplitudes, and we base it on the change to the right averaged with the change to the left.  There might be

 

 

 

In principle one can imagine a matrix that links the adjacent states in such a way as to calculate the change, and then use that change as the first approximation in computing the new vector by adding it to the current vector.  It is an amazing feature of function theory that the value of a function can be computed at an arbitrary distance away if the function and all its derivatives at some point are known.  This feature then allows us to take rotations, time evolution and spatial translations and express them in terms of generators and the amount.
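The Taylor-series remark can be checked numerically. The sketch below assumes the same convention as before (a move to the right by ε gives f(x − ε), factors of i and ħ dropped) and uses sin as the test function because all of its derivatives are known in closed form. The helper names are illustrative. Summing the series amounts to applying a truncated exponential of the derivative operator.

```python
import math

def sin_derivative(n, x):
    """n-th derivative of sin at x; the derivatives cycle with period 4."""
    return (math.sin(x), math.cos(x), -math.sin(x), -math.cos(x))[n % 4]

def translate_by_series(x, eps, terms):
    """Sum over n < terms of (-eps)^n / n! times the n-th derivative of sin at x."""
    return sum((-eps) ** n / math.factorial(n) * sin_derivative(n, x)
               for n in range(terms))

x0, eps = 0.3, 2.0                 # a large step, far outside the linear regime
exact = math.sin(x0 - eps)         # the translated function evaluated directly
for terms in (2, 5, 10, 20):
    print(terms, translate_by_series(x0, eps, terms), exact)
```

With only two terms (the first-order form) the large step is badly wrong, while keeping more terms of the series recovers the translated value using nothing but information at the single point x0.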

 

This discussion was meant to review some aspects of the definition of the momentum operator in terms of the derivative. This note is meant to help clarify how the form of this operator emerges as a derivative.

 

One nice feature of this development is that one can look at the matrix derived above

 

This is another way to see that the adjoint of the derivative operator changes sign. The above matrix is a REAL ANTISYMMETRIC matrix, so the transpose is minus the original. Note that if I multiply this matrix by i, then the minus sign generated by the antisymmetry is canceled by the minus sign generated by the complex conjugate. Thus

The adjoint is the same as the original.
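A small check of this statement, assuming that the matrix referred to is the central-difference matrix (the change to the right averaged with the change to the left). That choice, the grid spacing h, and the helper names are assumptions made for illustration.

```python
def central_difference(n, h=1.0):
    """n x n matrix D with (D x)_i = (x_{i+1} - x_{i-1}) / (2h) for interior i."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            D[i][i + 1] = 1.0 / (2 * h)    # neighbor to the right
        if i - 1 >= 0:
            D[i][i - 1] = -1.0 / (2 * h)   # neighbor to the left
    return D

def transpose(M):
    return [list(col) for col in zip(*M)]

def adjoint(M):
    """Conjugate transpose."""
    return [[z.conjugate() for z in row] for row in transpose(M)]

D = central_difference(4)
minus_D = [[-d for d in row] for row in D]
iD = [[1j * d for d in row] for row in D]

print(transpose(D) == minus_D)   # is D real antisymmetric?
print(adjoint(iD) == iD)         # is i times D its own adjoint?
```

The transpose swaps the entries for the right and left neighbors, which is the discrete version of the sign change that appears when the derivative is moved to the other side of an inner product.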

 

Another way to examine the sign is to look at translations and ask how the two functions that enter the inner product when expressed in position space must translate to keep the inner product unchanged.

 

Let us first review the situation for a 2-d space and the rotation operator.

 

Consider the standard dot product of two vectors.

 

 

Now imagine that we want to rotate a vector C and then take the dot product. I can write this in a similar notation as

 

  rotate C through an angle θ and then take the dot product with A.

 

I can preserve the dot product but allow the rotation to act on either C or A.  So I define what I mean by rotating A by demanding that the dot product doesn’t change. In this case, if I rotate A by –θ (minus theta) the dot product remains the same.

Notice that with matrix multiplication you can multiply to the left or to the right. The result of the rotation matrix acting on x,y is a rotation through θ, while a,b times the matrix results in a rotation in the opposite direction, through –θ.

 

 

 

We see that the notation using row and column vectors and 2x2 matrices allows for matrix multiplication either to the left or to the right.  You can see that the effect of a matrix on a column is different from its effect on a row, and that when the matrix represents a rotation the difference corresponds to a change in sign for θ.
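The row-versus-column behavior can be verified with a short sketch. It assumes the usual counterclockwise rotation matrix; the vectors A and C and the angle are arbitrary illustrative values, and the helper names are not from the notes.

```python
import math

def rotation(theta):
    """Standard 2x2 matrix for a counterclockwise rotation through theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matrix_times_column(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def row_times_matrix(u, M):
    return [u[0] * M[0][0] + u[1] * M[1][0],
            u[0] * M[0][1] + u[1] * M[1][1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

theta = 0.7
A, C = [1.0, 2.0], [3.0, -1.0]
R = rotation(theta)

lhs = dot(A, matrix_times_column(R, C))          # rotate C through theta, then dot with A
rhs = dot(row_times_matrix(A, R), C)             # let the matrix act to the left on A
A_back = matrix_times_column(rotation(-theta), A)  # A rotated through minus theta

print(lhs, rhs)
print(row_times_matrix(A, R), A_back)
```

The two dot products agree, and the row obtained by multiplying A into the matrix from the left has the same components as A rotated through the opposite angle.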

 

Let us remember that the translation of a function

 

 

What happens if we translate a function A to the right by an amount ε?

We could indeed also move the function B and get the same overlay so long as we moved it in the opposite direction.  So in order to keep the inner product the same as we apply a translation either to the BRA or KET space we must step in the opposite direction.

 move B to the left.

 move A to the right.

 

This shows that we need to apply a different rule to the BRA space for a derivative so that translations are in opposite directions in each space.
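The opposite-direction rule can be illustrated on a discrete grid. This is a sketch under simplifying assumptions: the functions are real, sampled on 8 points with periodic wrap-around, and the step is a whole number of grid points. The sample values and helper names are made up for the illustration.

```python
def shift_right(values, steps):
    """Cyclically move sampled values `steps` grid points to the right
    (a negative `steps` moves them to the left)."""
    n = len(values)
    return [values[(i - steps) % n] for i in range(n)]

def inner(b, a):
    """Discrete stand-in for the overlap integral of two real functions."""
    return sum(bi * ai for bi, ai in zip(b, a))

A = [0, 1, 4, 2, 0, 0, 0, 0]
B = [0, 0, 3, 5, 1, 0, 0, 0]
steps = 2

moved_A = inner(B, shift_right(A, steps))        # move A to the right
moved_B = inner(shift_right(B, -steps), A)       # move B to the left instead
same_way = inner(shift_right(B, steps), A)       # move B to the right (wrong direction)

print(moved_A, moved_B, same_way)
```

Moving A to the right and moving B to the left give the same overlap, while moving B in the same direction as A does not.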

 

 we need to remember to add a minus sign to the usual definition of the derivative operator when we apply it to the left.

 

 

This is the matrix element of the derivative operator. The default meaning is that it operates to the right with a positive sign as stated above and to move it to the left one needs to add a minus sign.

 

There are two intertwined questions:

  1. How does one apply an operator either to the left or to the right?
  2. What is the adjoint of an operator?

 

The adjoint is the form of the operator that would produce the same vector in the bra space as was produced by the original in the ket. The translation operator for the ket space moves a state ε to the right. So the operation on both A and B in ket space results in right translated states. However the right translation operator must translate a bra to the left. Thus

 

 

In other words, if the operator produces a different effect on the bra space than on the ket space, what is the operator that has the same effect?