Elegant multispline motion controller

From RepRap
Jump to: navigation, search

Elegant MultiSpline Motion Controller

a rough draft by Brian Korsedal

This is a project to redo the main controller board and motion controller. I am designing a new board based on a Xilinx Spartan-6 chip (XC6SLX9-2TQG144C-ND).[1] This is a very low cost but high performance FPGA. The FPGA program will be stored in an external EEPROM. This chip is more expensive than the current IC. The chip and EEPROM should cost around $25. However the performance gains should outweigh this cost. I also think that there can be some major cost savings from mass production or merging boards together. This design will include a high speed USB 1.1 interface running at the full 12 MBPS. There will be a small amount of preprocessing which occurs on the desktop PC.

This device will not use G-code. It will use a custom language based on cubic beziers. This allows for much better description of arcs and will result in much higher quality prints with a much lower data throughput requirements. It is easy to write an application to translate from G-code to the custom bezier language. This will be the flow to support backwards compatibility.

Algorithm Explanation

The basis of the motion profile is cubic Bezier curves. These curves are easy to specify and can accurately describe toolpaths which would be a nightmare to describe in Gcode.

(insert example here)

Above is an example of a cubic Bezier. They are specified by four points. Point 1 is the start point. Point 2 is a control point. It is synonymous with the velocity vector at the start of the curve Point 1. Control point 3 is synonymous with the inverse of the velocity vector at the end of the curve. Point 4 is the end point of the curve. The magnitude of the velocity vectors is specified in terms of the curve parameter.

Bezier curves are defined by a curve parameter. This parameter ranges from 0 to 1. It describes where you are located along the curve. This parameter is usually called u. There is no simple relationship between u and S (the position in the x,y,z). This causes a big problem when plotting motion along this curve.

A key advantage of cubic beziers is that they describe jerk limited motion. The first derivative (velocity) of a cubic Bezier curve (dS/du) is a quadratic Bezier. The second derivative (acceleration) is a straight line. The third derivative (jerk) is a constant. This naturally creates the desired motion profiles while still being easy to compute. It might be possible to further simplify the computations using quadratic Beziers instead of cubic Beziers. However, I am unsure if the quadratic Beziers will describe all the required toolpaths accurately. There are also interesting papers on Biarc specifications or higher orders. This can be determined by whomever continues my work. “A good plan violently executed now is better than a perfect plan executed next week.” General George S. Patton.

As was noted earlier there is no simple relationship between u and S (position in x,y,z). This creates a large problem. The way I solve it is to use numerical integration and parallel computing. FPGA's are masters of parallel computing. Even low cost FPGA's will crush most DSP and processors in raw power. In the past FPGA's have been difficult to program. Fortunately there are new tools on the market which dramatically reduce the effort needed to program FPGA's. These new tools are based on Matlab and Simulink. They allow an exponential leap in productivity and allow engineers to keep pace with Moore's law. This design was implemented using Xilinx System Generator.

Matlab Code

Here is the matlab code we are going to discuss.File:EMMC5.m

This code performs all the tasks that the system will perform. In the beginning of the file there are some control points for three bezier curves. The control points are four dimensional. They have X,Y,Z and Ext components. The Ext component corresponds to the instantaneous extrude rate. It has a range from 0 to 1. O is off. 1 is fully on. All values in between are valid. This allows us to perform arcs in the extrude path which will reduce the jerkyness in the extrude profile. There are different velocity limits depending on the instantaneous extrusion rate.

% *************************************************************************
% This section simulates parsing a file.  It just sets up an array of
% control points.  It specifies the number of subdivisions per curve.  The
% number of subdivions in the hardware design is 2^16.  This takes a long
% time to calculate.  We are using a smaller number in the simulation so
% that the run time is reasonable.
% *************************************************************************
P0x = [0   1   2];
P0y = [0   1   2];
P0z = [0   1   2];
P0e = [0   0   1];

P1x = [1   1   4];
P1y = [0   2   2];
P1z = [0   2   2];
P1e = [0   0   1]; 

P2x = [1   1   3];
P2y = [0   2   1.5];
P2z = [0   2   1.5];
P2e = [0   1   1];

P3x = [1   2   2];
P3y = [1   2   1];
P3z = [1   2   1];
P3e = [0   1   1];

The first part of the file holds the control points for three curves. These control points would normally be produced in the code which currently generates Gcode. It will save these values in a *.cbml file. The file will just consist of bezier curves and commands to set temperature and pause the toolhead.

pntsPerCurve = 2^16;
K    = 40000;    % this is used for finding the max velocity due to curvature
ExtLimit =100;
VMin = 0.1;
HDSlope = 1000;

These are some constants associated with the code. The pntsPerCurve is the number of subdivisions used for numeric integration. K is a term that is derived from the maximum force that the motors can generate and the mass of the toolhead. It is directly proportional to the maximum force and inversely proportional to the mass of the toolhead. The VxyzLimit is the maximum velocity when the toolhead is not extruding. The ExtLimit is the velocity limit when the instantanious extrude rate is 1. VMin is the minimum velocity of the toolhead. This cannot be set to zero. A zero may cause the toolhead to get stuck at discontinuities. The HDSlope is used to generate the velocity profile into and out of a discontinuity.

This code and the following two sections should be worked into the 'print driver' for the reprap. The first section should be replaced by the *.cbml file.

% *************************************************************************
% the next step is to put the control points into a form which is easier
% for the FPGA to compute.  By multiplying out the bezier polynomials and
% refactoring we can find new constants which will require less
% multiplications inside the FPGA.
% This is all done in preprocessing on the PC or within the program which
% creates the *.cbml file.  This code currently does not check for hard
% nonlinearities in the velocity.  It just checks the curvature.
% ************************************************************************* 

for n=1:length(P0x)
    % expand equations.  Solve for constant jerk form
    % reduces the number of multilies
    % speeds up loop
    % constant jerk equations of motion:
    % position = p0 + v0t + 1/2 a0t^2 + 1/6 j0t^3
    % velocity = v0 + a0t + 1/2 j0t^2
    % acceleration = a0 + j0t;
    % jerk = j0;
    % first derivative control points
    % Velocity hodogram
    % dX  = (-t+1)^2*dP0x + (-t+1)*t*2*dP1x + t^2*dP2x;
    % dX  = (t^2-2t+1)*dP0x + (-2*t^2 + 2*t)*dP1x + t^2*dP2x;
    % dX  = (dP0x-2*dP1x+dP2x)*t^2 + (-2*dP0x + 2*dP1x)*t + dP0x;
    % desired form: velocity = v0 + a0t + 1/2 j0t^2
    % Vx0 = dP0x                        
    % Vx0 = 3*(P1x - P0x)
    % Ax0 = 2*dP1x-2*dP0x
    % Ax0 = 6*(P2x - P1x) - 6*(P1x - P0x))
    % Ax0 = 6*(P2x - 2*P1x + P0x)
    % Jx0/2 = dP0x-2*dP1x+dP2x
    % Jx0/2 = 3*(P1x-P0x) - 6*(P2x-P1x) + 3*(P3x-P2x)
    % Jx0/2 = 3*(P3x-P2x) - 2*3*(P2x-P1x) + 3*(P1x-P0x)
    % Jx0/2 = 3*(P3x - P2x - 2*P2x + 2*P1x + P1x - P0x)
    % Jx0 = 6*(P3x - 3*P2x + 3*P1x - P0x)
    % verify equation by using second hodogram
    % should have the same answers
    % second derivative control poitns
    % d2P0x(t)=2*(dP1x(t)-dP0x(t));
    % d2P1x(t)=2*(dP2x(t)-dP1x(t));
    % Acceleration hodogram
    % dX2 = (1-t)*d2P0x + t*d2P1x;
    % dX2 = (d2P1x - d2P0x)t + d2P0x
    % desired form: acceleration = a0 + j0t
    % Ax0 = d2P0x
    % Ax0 = 2 * (dP1x - dP0x)
    % Ax0 = 2 * (3*(P2x - P1x) - 3*(P1x - P0x))
    % Ax0 = 6P2x - 6P1x -6P1x + 6P0x
    % Ax0 = 6*(P2x + 2*P1x + P0x)
    % Jx0 = d2P1 - d2P0x
    % Jx0 = 2(dP2x-dP1x) - 2(dP1x-dP0x)
    % Jx0 = 2*(dP2x - dP1x -dP1x + dP0x)
    % Jx0 = 2*(dP2x - 2*dP1x + dP0x)
    % Jx0 = 2*(3*(P3x-P2x) - 2*3*(P2x-P1x) + 3*(P1x-P0x))
    % Jx0 = 6*(P3x - P2x - 2*P2x + 2*P1x + P1x - P0x)
    % Jx0 = 6*(P3x - 3*P2x + 3*P1x - P0x)
    Px0(n) = P0x(n);
    Vx0(n) = 3*(P1x(n) - P0x(n));
    Ax0(n) = 6*(P2x(n) - 2*P1x(n) + P0x(n));
    Jx0(n) = 6*(P3x(n) - 3*P2x(n) + 3*P1x(n) - P0x(n));
    Py0(n) = P0y(n);
    Vy0(n) = 3*(P1y(n) - P0y(n));
    Ay0(n) = 6*(P2y(n) - 2*P1y(n) + P0y(n));
    Jy0(n) = 6*(P3y(n) - 3*P2y(n) + 3*P1y(n) - P0y(n));
    Pz0(n) = P0z(n);
    Vz0(n) = 3*(P1z(n) - P0z(n));
    Az0(n) = 6*(P2z(n) - 2*P1z(n) + P0z(n));
    Jz0(n) = 6*(P3z(n) - 3*P2z(n) + 3*P1z(n) - P0z(n)); 

    Pe0(n) = P0e(n);
    Ve0(n) = 3*(P1e(n) - P0e(n));
    Ae0(n) = 6*(P2e(n) - 2*P1e(n) + P0e(n));
    Je0(n) = 6*(P3e(n) - 3*P2e(n) + 3*P1e(n) - P0e(n));

    % *********************************************************************
    % We also need the curvature at the endpoints of the arcs.  This is
    % used to set soft discontinuities in the motion profile.  These soft
    % discontinuites will create slow down and speed up areas near tight
    % corners.  There is are a few more cases which need to be considered
    % in the preprocessing.  These will not be performed in this system
    % simulation.  They will be explained later.  
    % Hard non linearities are handled by the program which writes the
    % *.cbml file.  
    % *********************************************************************
    % Curvature= 1/radius = Length(Cross(X'(t),X"(t)))/pow(Length(X'(t)),3)
    % CurveLimit = sqrt (Length(Cross(X'(t),X"(t)))/pow(Length(X'(t)),3))
    % V=sqrt(Vx0(n)^2+Vy0(n)^2+Vz0(n)^2);
    % Cross=sqrt((Vy*Az-Vz*Ay)^2 + (Vx*Az-Vz*Ax)^2 + (Vx*Ay-Vy*Ax)^2 );
    % CurveLimit=sqrt(K*V^3/Cross)
    % calculate velocity and curvature at start points
    Cross=sqrt((Vy*Az-Vz*Ay)^2 + (Vx*Az-Vz*Ax)^2 + (Vx*Ay-Vy*Ax)^2 );
    % calculate velocity and curvature at end points
    Vx  = Vx0(n) + Ax0(n) + Jx0(n)/2; 
    Vy  = Vy0(n) + Ay0(n) + Jy0(n)/2; 
    Vz  = Vz0(n) + Az0(n) + Jz0(n)/2; 
    Ax = (Ax0(n) + Jx0(n)); 
    Ay = (Ay0(n) + Jy0(n)); 
    Az = (Az0(n) + Jz0(n)); 
    Cross=sqrt((Vy*Az-Vz*Ay)^2 + (Vx*Az-Vz*Ax)^2 + (Vx*Ay-Vy*Ax)^2 );
    % calculate extrusion position and velocities at endpoints
    StartExt(n)    = Pe0(n);
    EndExt(n)      = Pe0(n)+ Ve0(n) + Ae0(n)/2 + Je0(n)/6;
    StartExtVel(n) = Ve0(n);
    EndExtVel(n)   = Ve0(n) + Ae0(n) + Je0(n)/2;

This block of code converts from the control points into an easier to compute form. It also computes values at the end points of the bezier curves. These end values are used to determine discontinuities.

% *************************************************************************
% Here we use the calculated curvatures to specify the soft discontinuities 
% between arc segments.  The reduction in velocity is determined by the
% force created by the curvature.  The force is proportional to the
% velocity squared multiplied by the curvature.  The user specifies the
% maximum acceleration the machine can handle.  We set these equations
% equal to eachother and solve for the maximum velocity around the curve
% due to it's curvature.  We then set the minimum speed required at the
% discontinuity.  
% Small discontinuities may slip past this algorithm.  These should fall
% within the limits of the machine.  I don't think they will be a problem
% in the actual machine but might cause slight fluctuations in the
% velocity.
% A hard discontinuity is encountered when there is an instantaneous change
% in the direction of the velocity during a transition from one curve to
% the next.  It is undesired and requires a complete stop at that point.
% Hard discontinuities can also be caused by an instantaneous change in the
% extrusion or velocity of the extrusion from one curve to the next.  This
% will cause a full stop at the transition point.
% ************************************************************************* 

for n=1:length(StartCurveLimit)
    if(n==1) % very first arc.  Set hard discontinuity
        sdcntLimit(n) = VMin;
    elseif(     (StartExt(n)~=EndExt(n-1)) ||...
                (StartExtVel(n)~=EndExtVel(n-1)) ||...
                (StartTheta(n)~=EndTheta(n-1)) ||...
                (StartPhi(n)~=EndPhi(n-1)) )
        sdcntLimit(n) = VMin;    
    elseif(StartCurveLimit(n)<EndCurveLimit(n-1)) % Start Discontinuity Limited
        sdcntLimit(n) = StartCurveLimit(n);
    else % End Discontinuity from last arc Limited
        sdcntLimit(n) = EndCurveLimit(n-1);
    edcntX(n) = Px0(n) + Vx0(n) + Ax0(n)/2 + Jx0(n)/6;
    edcntY(n) = Py0(n) + Vy0(n) + Ay0(n)/2 + Jy0(n)/6;
    edcntZ(n) = Pz0(n) + Vz0(n) + Az0(n)/2 + Jz0(n)/6;    
    if(n==length(EndCurveLimit)) % last arc.  Set discontinuity at end point
        edcntLimit(n) = VMin;
    elseif(     (StartExt(n+1)~=EndExt(n)) ||...
                (StartExtVel(n+1)~=EndExtVel(n)) ||...
                (StartTheta(n+1)~=EndTheta(n)) ||...
                (StartPhi(n+1)~=EndPhi(n)) )
        edcntLimit(n) = VMin;    
        edcntLimit(n) = EndCurveLimit(n);
        edcntLimit(n) = StartCurveLimit(n+1);

This code uses the endpoints to determine all the discontinuities between curves.

% this code is designed to act similar to what will
% happen in the FPGA.



    %    Calculate speed profile variables
    Cross=sqrt( (Vy*Az-Vz*Ay)^2 + (Vx*Az-Vz*Ax)^2 + (Vx*Ay-Vy*Ax)^2 );
    % Curve Limit + slope * distance
    StartDiscLimit=sdcntLimit(intT) + HDSlope * ....
        sqrt( (Px0(intT)-Px)^2 + (Py0(intT)-Py)^2 + (Pz0(intT)-Pz)^2 );
    EndDiscLimit=edcntLimit(intT) + HDSlope * ....
        sqrt( (edcntX(intT)-Px)^2 + (edcntY(intT)-Py)^2 + (edcntZ(intT)-Pz)^2 );
    %   determine limiting factor
    %accumulate and check velocity
    VAccum=VAccum + V/pntsPerCurve;
        VStep=VStep + NextStep/pntsPerCurve;
        RoughVelProfile(pnt)=NextStep/pntsPerCurve; %FIFO buffer rough profile
    % step time

This code simulates the first numeric integration in the bezier engine. Then there is a large math block which determines the five possible speed limits for that point in time. The velocity can be limited by the VxyzLimit value, Extrusion rate limit, Curvature limit, start discontinuity limit and end discontinuity limit. Then there is a block of code which selects the lowest of those five limits. The next block is an accumulator which measures how far along the curve we currently are. It accumulates all the velocity magnitudes. There is another accumulator which records the steps. When the first accumulator is larger than the second accumulator we have reached a new step point. We increment the second accumulator according to the minimum velocity limit. We record these steps as the rough velocity profile.

% Upsampling CIC for rough profile
rate = 32;
CIC_filt = conv(conv(ones(1,rate),ones(1,rate)),ones(1,rate));

This applies a CIC filter to the rough profile. The values for this CIC filter need to be fine tuned.

% second pass


    PeAccum=PeAccum + Pe*sqrt(Vx^2+Vy^2+Vz^2)/pntsPerCurve;

This block of code does the second pass on the bezier engine using the filtered velocity profile.

Below is the print path in 3D. Below that picture is the speed profile along this path.

Path.jpg Speed Profile.jpg

Notice the smooth acceleration and decelerations at the start and end of the motion. The first part of the speed profile the extruder is off. In the second half of the speed profile the extruder is on. Notice how the print head smoothly slows down when the extruder switches from off to on. The dips in the speed profile are caused by the curvature of the motion. The print head naturally slows down around tight corners.

Here is a better graph. It shows the path in 3D and the color is the velocity at that point in time. Blue is slow and red is fast.

Color velocity2.jpg

The FPGA will closely mirror what happens in the matlab code.

FPGA implimentation and theory

The FPGA will be split into several parts.

Pulse Probability Modulator

The MOSFETs for the heaters are driven by a Pulse Probability Modulator. The output of my modulator is driven by probability. There is a psuedo-random number generator in the design. It compares this random number with the setting for the DC voltage level. This creates a random high or low at the gate of the mosfet. The probability of this output is driven by the DC voltage setting. The higher the setting inside the FPGA, the higher probability that the FPGA will drive the gate high.

The psuedo-random number generator is a linear shift register. Theory can be found on Wikipedia. In this configuration it will generate a repeating sequence of 2^16-1 psuedo-random numbers. None of the numbers will be repeated in the sequence.

The update rate of the linear shift register is driven by a speed parameter. It is configured so that the update rate is equal to the FPGA clock rate divided by (128 - speed). This parameter will be determined based upon the switching speed of the MOSFET. Higher update rates on the gate will create less noise. If the setting is too high it might cause issues with heat or problems with duty cycle on the MOSFET.

The advantage of this type of modulation is a decrease in spurious noise. PWM is a square wave. This creates very high frequencies at the harmonics of the square wave.

PDM block diagram.JPG

Above you can see a block diagram for the Pulse Probability Modulator. These next two plots show the output of the block. The top signal is a square wave with a 25% duty cycle. The second signal is the output of the PPM with a duty cycle of 25%. The next two signals are the results of a low pass filter. The low pass filter in the real Reprap is the nicrome wire or whatever is being heated. This is a very low pass filter.

Filtered PDM.jpg

Filtered PDM Zoomed.jpg

Next is a comparison of the spectrum of the two waves. They both have a sharp peak at DC showing the 25% duty cycle. The PPM drops off to a noise floor about 20 dB down from the peak. The square wave drops down sharply but then spikes dramatically at the harmonics of the square wave.


FFT PDM zoomed.jpg

Sigma Delta ADC

There are some app notes[2] on how to make a sigma delta ADC using the FPGA. I'm going to impliment a dual channel sigma-delta ADC in the FPGA to read the temperature sensors.

USB 1.1 interface

Curve FIFO

There is an internal fifo to enable the dumping of a file to the FPGA. This internal fifo will be composed of 32 block ram's. Each block ram is 1024 18-bit entries. Each bezier curve requires 18 16-bit entries. These entries are Jx0, Ax0, Vx0, Px0, Jy0, Ay0, Vy0, Py0, Jz0, Az0, Vz0, Pz0, Je0, Ae0, Ve0, Pe0, start discontinuity minimum velocity and end discontinuity minimum velocity. We can use a stuffing algorithm to pack the 16-bit entries into 18-bit entries and use all the bits possible. Therefore the internal buffer can store 2048 bezier curves. An external DDR memory buffer might be added at a later date. I think it would be useful, but it would require a lot of work to add.

Some objects can be dumped to the device. It all depends on the number of curves. A cup would not require many curves and could probably be dumped to the FPGA. A large solid object might require many more curves and might not fit in internal memory.

MultiSpline Engine

Bezier engine.jpg

Above is the MultiSpline Engine. It is called MultiSpline because there are two splines being processed. The Bezier equations of motion are fairly difficult to process in their standard form. I put them into a numerical integration form which is much easier to implement in an FPGA. The basic structure used is an accumulator. I time division multiplex all the values through this single accumulator. Each spline needs twelve accumulations. The data flows through the accumulator in this order Ax, Vx, Px, Ay, Vy, Py, Az, Vz, Pz, Ae, Ve, then Pe. There are two splines so first the A spline is processed then the B spline is processed. This means that a new data point is created for both the A and B splines every 24 clocks.

There numeric integration is performed with a shift and add. The time steps are 1/2^16. Therefore at each point the new acceleration will be the old acceleration plus the jerk/2^16. The same is performed for the velocity and position. This equals an extra register and a shift in the FPGA. It's very simple logic to implement.

There are some other controls to handle loading new data into the engine and the modes of the DSP48. Velocity Profile Engine CIC Fitler Straight Line Interpolator

A4984 Controller

A4984 cntl top.jpg

This block of code interfaces with the A4984 motor controller IC. There are a couple of timing restrictions on the inputs to these chips which must be taken into consideration. The pulse duration must be at least 1us high and 1us low. Another restriction is on the control pins. These cannot toggle during a pulse. They need to be stable at least 0.2us before a pulse goes high.

A4984 internals.jpg

I designed this code for minimal area. The four channels are time division multiplexed in the code. The output rate will be 1/4th of the clock rate. The outputs will update at around 12 MHz. I use the MSB of a counter to generate the pulse. This counter freely rolls over and is incremented every clock. I stall the counter at 25. It will wait at this value until a new pulse is received. Then it starts counting. Approximately 0.2us after the pulse is received the MSB of the counter goes high. It will remain high for about 1us. Then it will go low. Then the counter continues to count until it hits the stall location.

The direction pin is updated instantly when a new pulse is received. This is the simplest way to interface with the A4984 chips.

real-time tasks

A list of real-time tasks currently done by non-FPGA electronics does that perhaps could be done by the FPGA:

  • coordinate X and Y movements so that diagonal lines are straight, circles are circular, etc.
  • coordinate extruder so that we get nice, consistent, water-tight plastic placement in the right places, while minimizing unwanted oozing in the gaps and corners
  • Keep track of current XYZ position relative to home position
  • Zero an axis position every time that axis hits the MIN position.
  • panic shut-off when any axis hits the MIN or MAX position unexpectedly
  • freeze shut-off when the human hits the Big Red Button ... is it possible to "resume" later?
  • "chopper" drive to limit current to steppers to below the rated maximum current the stepper can handle
  • "thermal overload protection" to limit the temperature of the power transistors to below their rated maximum.
  • translate abstract lines and splines to sequences of "step", "dir" pulses
  • translate "step", "dir" pulses to half-step patterns
    • (optional) vacillate between adjacent half-step patterns at the appropriate PWM or PPM duty cycle to implement microstepping.
  • keep hot end of extruder at the right temperature to melt plastic.
  • keep Heated Bed at the right temperature to prevent warping.
  • panic shut-off if either heater thermistor ever fails, or if either temperature spirals out-of-control, or if the temperature plummets out-of-control.

distant future possibilities this opens up

  • The non-Cartesian designs such as RepOlaRap pretty much *require* coordinated, non-linear motor positioning for every path, even straight lines. Hopefully cubic splines will be an adequate approximation.
    • Should the FPGA dynamically translate from Cartesian XYZ positions to non-Cartesian motor positions in real time?
  • Rather than completely regular PWM or almost completely random PPM, how about something in-between?[3].

position feedback

Main page: RepRapServo 1 0

A FPGA may make it easier to handle the streams of data coming back from one or more position sensors:

  • Constantly measure the current position with an encoder, and use that position to decide which direction and how strongly to nudge the motor to get it to the desired position.
  • Can a webcam to measure the actual position of the tip of the toolhead? Does that give enough accuracy to compensate for backlash, belt stretching, stepper skipping, inaccurate estimates of polar arm lengths, etc.?
  • Can a webcam to measure the current extruded plastic width or height? Then if it's too thick, speed up X and Y or slow down the extruder to compensate?