sympy
to walk through the optimization stepsimport numpy as np
from sympy import *
init_printing(use_unicode=True)
# Bokeh for interactive plots
from bokeh.plotting import figure, output_notebook, show
from bokeh.layouts import row, column
from bokeh.models import Span, PointDrawTool
output_notebook()
def plot(w=700,h=500,title='',x='X',y='f(x)'):
'''Wrapper function to ease starting a new plot.'''
p = figure(plot_width=w,plot_height=h,title=title,
toolbar_location="above",
tooltips=[("out", "$y"),("x", "$x")],
tools='pan,hover,wheel_zoom,reset')
# Define a continuous line for the x-axis
hline = Span(location=0, dimension='width', line_color='grey', line_width=1)
p.renderers.extend([hline])
# Labels
p.xaxis.axis_label = x
p.yaxis.axis_label = y
return p
def draw(p):
'''Wrapper function for generating a drawing tool on a graph'''
renderer = p.scatter(x=[], y=[], color='black', size=10)
draw_tool = PointDrawTool(renderers=[renderer], empty_value='black')
p.add_tools(draw_tool)
p.toolbar.active_tap = draw_tool
return p
Consider the following two functions:
$$ f(x) = -x^2 $$ $$ g(x) = x^2 $$
def f(x):
return -x**2
def g(x):
return x**2
Let's plot them. For what value of $x$ in the interval $[-5,5]$ does each function reach a maximum or minimum value?
x = np.arange(-5,5.1,.1) # Define the Function Domain
# Plot 1
p = plot(w=700,h=300)
p.line(x,f(x),line_width=3,alpha=.3)
# p.scatter([0],[0],size=6)
p = draw(p)
# Plot 2
p2 = plot(w=700,h=300)
p2.line(x,g(x),line_width=3,alpha=.3)
# p2.scatter([0],[0],size=6)
p2 = draw(p2)
show(column(p,p2)) # Render plot
What about the following function?
$$h(x) = -2x \cos{x} $$
def h(x):
return -2*x*np.cos(x)
Let's plot them again. This time let's try to "eyeball" where the minima and maxima of the function for $x \in [-5,5]$
x = np.arange(-5,5.1,.1) # Define the Function Domain
# Plot Maxima
p = plot(w=700,h=400,title="Where are the maxima and minima? Draw them.")
p.line(x,h(x),line_width=3,alpha=.5,color="orange")
# Generate a drawing tool to locate the extrema
p = draw(p)
show(p)# Render plot
Something interesting emerges when we look at the first derivative of the function $h(x)$.
$$h(x) = -2x \cos{x} $$
$$h'(x) = -2 \cos(x) + 2x\sin(x) $$
def dh(x):
return -2*np.cos(x) + 2*x*np.sin(x)
p = plot(w=700,h=400,title="h(x) and h'(x) for x in [-5,5]")
# Plot function
p.line(x,h(x),line_width=3,alpha=.5,color="orange",legend="h(x)")
p.line(x,dh(x),line_width=3,alpha=.4,color="steelblue",legend="h'(x)")
p = draw(p)
show(p) # render function
A critical point is any $x^* \in [a,b]$ such that either $f'(x^*) = 0$ or $f'(x^*)$ is undefined. These are the points on the function where "things happen": the function either blows up, jumps (discontinuity), or is stationary.
Stationary points occur when $f'(x^*) = 0$: at these points in the function, there is no change in the function. We are interested in critical points because local extrema occur at these points.
Extrema of a function correspond with its maximum and minimum values within some specified domain. This can be thought of as "the peaks and the valleys" of the function. The first derivative of a function tells use the slope of the tangent line. As we see above, when function reaches a maximum/minimum, slope of the tangent line goes to zero. The first derivative carries information about the potential location of the maxima and minima within a specified domain: the extrema will always be a point where the first derivative crosses the x-axis.
x = np.arange(-5,5.1,.1) # Define the Function Domain
# Define the first derivatives of f and g
df = lambda x: -2*x
dg = lambda x: 2*x
# Plot Maxima
p = plot(w=700,h=400,title="Maxima of f(x)")
p.line(x,f(x),line_width=3,alpha=.3,legend="f(x)")
p.line(x,df(x),line_width=3,alpha=.3,color="green",legend="f'(x)")
p.scatter([0],[0],size=6)
# Plot Minima
p2 = plot(w=700,h=400,title="Minima of g(x)")
p2.line(x,g(x),line_width=3,alpha=.3,legend="g(x)")
p2.line(x,dg(x),line_width=3,alpha=.3,color="green",legend="g'(x)")
p2.scatter([0],[0],size=6)
show(column(p,p2)) # Render plot
Graphically, we can clearly locate these points visually (at least for simple functions like $f(x)$ and $g(x)$), but how would we determine if a critical point is a maxima or minima without visualizing it.
Look at the above graph and see if we can't gather any intuition for how this might work?
Consider the function $w(x)$
$$ w(x) = 2x^2 + x + 1$$
What are the critical values? That is, when is the derivative (a) equal to one or (b) undefined.
$$ w'(x) = 4x + 1 $$
$$ w'(x^*) = 4x + 1 = 0 $$
$$ x^* = -.25$$
Does this point correspond with a maxima or minima? How would we figure this out?
What is the functions behavior around that point? From the above figure we might surmise from the above graphic is that what we're most interested in is the behavior of the function both just prior and just after the critical point. Specifically, the value of the slope.
With our hypotheses in hand, let's plug in values around the critical point and see their sign.
position | evaluation | sign | |
---|---|---|---|
Left | $ w'(-.5) = 4(-.5) + 1 = -1 $ | $-$ | decreasing |
Critical value | $ w'(-.25) = 4(-.25) + 1 = 0 $ | $~$ | stationary |
Right | $ w'(0) = 4(0) + 1 = 1 $ | $+$ | increasing |
$x^* = -.25$ appears to correspond with a minima.
Let's inspect this result visually
w = lambda x: 2*x**2 + x + 1
p = plot(h=300,w=700)
p.line(x,w(x),line_width=3,alpha=.4)
p.scatter([-.25],w(-.25),size=7,color="black")
show(p)
As we'll see below, we can get more precise about this definition by using higher order derivatives.
A local extremum is the largest/smallest value of a function over some interval of values in the domain of the function (i.e. over some interval on the x-axis). A global extremem is the largest/smallest point on all values of the function
Consider again $h(x)$, which is plotted above. Of it's extrema, which are global and which are local?
p = plot(w=700,h=300,title="h(x) for x in [-5,5]")
p.line(x,h(x),line_width=3,alpha=.5,color="orange",legend="h(x)")
p = draw(p)
show(p)
What about for function $t(x)$ and $v(x)$?
$$ t(x) = (x-1)(x-3)(x-4) $$
def t(x):
return (x-1)*(x-3)*(x-4)
def v(x):
return x
p = plot(w=700,h=400)
x = np.arange(1,4.1,.1) # domain
p.line(x,t(x),line_width=3,alpha=1,color="black",legend="t(x) for x in [1,4]")
x = np.arange(0,6.1,.1) # domain
p.line(x,t(x),line_width=3,alpha=.2,color="green",legend="t(x) for x in [0,6]")
p2 = plot(w=700,h=400)
x = np.arange(0,1.1,.1) # domain
p2.line(x,v(x),line_width=3,alpha=.5,color="steelblue",legend="v(x) for x in [0,1]")
p = draw(p);p2 = draw(p2)
p.legend.location = "top_left";p2.legend.location = "top_left"
show(column(p,p2))
Takeaways?
Recall the use of the sympy
module (see lecture 19). Let's leverage the module to find the higher order derivatives of function $f(y)$
y = symbols('y')
f = y*(2*y**2 + .3*y**3)
print("\nFunction f(y)")
display(f)
print("\nFirst Derivative of f(y)")
display(f.diff())
print("\nSecond Derivative of f(y)")
display(f.diff().diff())
print("\nThird Derivative of f(y)")
display(f.diff().diff().diff())
print("\nFourth Derivative of f(y)")
display(f.diff().diff().diff().diff())
print("\nFifth Derivative of f(y)")
display(f.diff().diff().diff().diff().diff())
Derivatives of any order tell us something about the shape of a function.
What do higher-order derivatives represent conceptually? Think of driving a car. The function $f(\cdot)$ describes how far we drove as a function of time. $$~\\Distance = f(time)\\~$$ The derivative tells us our speed as a function of time (i.e. how fast were we going). $$ ~\\Speed = f'(time)\\~$$ The second derivative tells us our rate of acceleration as a function of time (i.e. how hard we were pressing on the petal at any given point in the trip). $$ ~\\Acceleration = f''(time)\\~$$ The third derivative would tell us how much the car "jerked" at any moment in time (i.e. how quickly we pressed or let up on the petal). $$ ~\\Jerk = f'''(time)\\~$$ And so on...
# Functions
f = lambda time: -10*np.cos(time) + 10
df = lambda time: 10*np.sin(time)
ddf = lambda time: 10*np.cos(time)
# Domain
time = np.arange(0,(np.pi),.01)
# Plot
dim = 300,700
p = plot(h=dim[0],w=dim[1],y="Distance",x="Time")
p1 = plot(h=dim[0],w=dim[1],y="Speed",x="Time")
p2 = plot(h=dim[0],w=dim[1],y="Acceleration",x="Time")
p.line(time,f(time),line_width=2.5)
p1.line(time,df(time),line_width=2.5,color="darkred")
p2.line(time,ddf(time),line_width=2.5,color="gold")
show(column(p,p1,p2))
For any given function $f(x)$ that is continuous and differentiable within the domain $x \in [a,b]$
$$ \text{first-order derivative}: \frac{df(x)}{dx} = \lim_{h\to0}\frac{f(x-h)-f(x)}{h} = f'(x) = D_x$$
$$ \text{second-order derivative}: \frac{d^2f(x)}{dx^2} = \lim_{h\to0}\frac{f'(x-h)-f'(x)}{h} = f''(x) = D_x^2$$
$$ \vdots $$
$$ \text{nth-order derivative}: \frac{d^{n}f(x)}{dx^n} = \lim_{h\to0}\frac{f^{(n)}(x-h)-f^{(n)}(x)}{h} = f^{(n)}(x) = D_x^n$$
def graph(func,x,start=-5,end=5):
'''
Generate an interactive graph of a sympy defined function along with
it's first and second derivatives
Arguments:
- func: univariate function defined with a sympy instanitiated symbol.
- start: start of the numpy numerical array
- end: end of the numpy numerical array
Return:
- Renders an interactive graph.
'''
# Establish numerical range
num_range = np.arange(start, end + .1,.1)
# convert sympy and higher order derivatives into numpy output
f = lambdify(x, func, "numpy")
df = lambdify(x, func.diff(), "numpy")
ddf = lambdify(x, func.diff().diff(), "numpy")
# Establish Plot
p = figure(width=700,height=400,
toolbar_location="above",
tooltips=[("f_", "$y"),("x", "$x")],
tools='pan,hover,wheel_zoom,reset')
# Set Axis spans
vline = Span(location=0, dimension='height', line_color='black', line_width=.5)
hline = Span(location=0, dimension='width', line_color='black', line_width=.5)
p.renderers.extend([hline,vline])
# Plot function and first/second derivatives
p.line(num_range,f(num_range),
color="black",muted_color='black',muted_alpha=0.1,
line_width=4,alpha=.6,
legend=f"f({x.name}) = {func}")
p.line(num_range,df(num_range),
color="steelblue",muted_color='steelblue',muted_alpha=0.1,
line_width=4,alpha=.6,
legend=f"f'({x.name}) = {func.diff()}")
p.line(num_range,ddf(num_range),
color="orange",muted_color='orange',muted_alpha=0.1,
line_width=4,alpha=.6,
legend=f"f''({x.name}) = {func.diff().diff()}")
# Add draw feature
p = draw(p)
# Set plot conditions
p.legend.click_policy="mute"
p.legend.location = "top_left"
p.xaxis.axis_label = "X"
p.yaxis.axis_label = ""
show(p) # Output plot
Play with different functional forms and get a feel for their higher order derivatives.
z = symbols('z') # define symbol z
graph(-1+z*cos(z),z)
graph(z**3,z)
x = symbols('x') # Define sympy symbol
$$ f'(x^*) = 0 $$ $$ f''(x^*) < 0 $$
graph(-x**2,x)
$$ f'(x^*) = 0 $$ $$ f''(x^*) > 0 $$
graph(x**2,x)
$$ f'(x^*) = 0 $$ $$ f''(x^*) = 0 $$
graph(x**3,x)
Inflection points are locations where the graph turns from being concave to convex or vice versa. Such points are also known as saddle points and are not extrema. In general, we don't care about inflection points: we care about extrema!
<font color = 'red'> However, we have to make sure that inflection points are actually inflection points </font>. That is, the function goes from concave to convex or vice versa. It could actually be a minima/maxima. We can uncover if this is the case by taking the higher order derivatives.
$$ f'''(x^*) > 0~~\text{or}~~f'''(x^*) < 0~~\text{then}~~x^* =~\text{inflection point} $$
For function $f(x) = x^3$
$$ f'(x^*) = 0~\text{for}~x^* = 0$$
$$ f''(0) = 0$$
$$ f'''(x) = 6$$
(x**3).diff().diff().diff()
So $x^* =0$ for $f(x) = x^3$ is a inflection point.
However, consider the following function...
func = x**4
graph(func,x)
$$ f'(x^*) = 4x^3 = 0 $$ $$ x^3 = 0 $$ $$ x^* = 0 $$
$$ f''(0) = 12(0)^2 = 0 $$
$$ f'''(0) = 24(0) = 0 $$
$$ f^{(4)}(0) = 24 $$
$x^* = 0$ is actually a minima of $f(x)$.
Once you've located the critical points of a function, found the stationary points and determined if they are local minima or maxima, you need to make a determination if they are global minima or maxima. That is, across the entire specified domain of the function, is this in fact the largest (or smallest) point.
We can do this by actually plugging the boundary values into $f(x)$ and seeing if their value is larger/smaller than the interior extrema we located.
Again with function $f(x) = x^4$, we located a local minima at $x^* = 0$, is this in fact a local minima?
From the graph above, we can see the domain $[-5,5]$. Let's plug those values into $f(x)$.
$$ f(-5) = (-5)^4 = 625 $$
$$ f(5) = (5)^4 = 625 $$
Neither value is less than our minima
$$ f(0) = 0 $$
So $x^*$ is in fact a global minima for $f(x)$
Putting all the above together, here are the steps for minimizing or maximizing a univariate function (i.e. finding the global minima/maxima). These are the steps outlined in the reading (see pg. 168):
And with that, choose the value you care about the most (i.e. are you maximizing or minimizing?)
sympy
¶$$ f(x) = 0.3 x^{3} - x^{2} + 0.4 x + 2 $$
$$ \arg max_x f(x) $$
where $x \in [-4,4] $
# Our function!
x = symbols('x')
f = .3*x**3 -x**2 + .4*x +2
f
# Find the first derivative
df = f.diff()
df
# Solve for f'(x) = 0
Eq(df,0)
crit_points = solveset(Eq(df,0),domain=Interval(-4,4))
crit_points = list(crit_points)
crit_points
# Find the second derivative
ddf = df.diff()
ddf
# For x* in crit_points, plug each x* in to f''(x)
ddf.evalf(subs={x: 0}) # We can plut in values for x like this
Let's now plug in our critical points.
store = []
for x_ in crit_points:
deriv2_eval = ddf.evalf(subs={x: x_})
# Determine type
cp_type = None
if deriv2_eval > 0:
cp_type ="local minima"
elif deriv2_eval < 0:
cp_type ="local maxima"
else:
cp_type ="pot. inflection point"
# Print each type
print(f"f''({round(x_,2)}) = {round(deriv2_eval,2)} (x* = {round(x_,2)} is a {cp_type})")
# save output
store.append([x_,deriv2_eval])
store = np.array(store) # convert to numpy matrix
# Find the values for each interior extremum
print("Interior Extrema:")
for row in store:
crit_val = row[0]
ev = f.evalf(subs={x: crit_val}) # plug back into x
print(f'f({round(crit_val,2)}) = {round(ev,2)}')
# Find the values of the boundaries
boundaries = [-4,4]
print('Boundary Values:')
for b in boundaries:
ev = f.evalf(subs={x: b})
print(f'f({round(b,2)}) = {round(ev,2)}')
Are either of our stationary points global minima or maxima?
graph(.3*x**3 -x**2 + .4*x +2 ,x,start=-4,end=4)
Find all extrema for the following functions along the specified domain. State whether each extremum is a minimum or maximum. Of the minima and maxima, state which are local and which are global.
optimize
¶Python has a range of optimization algorithms, some of which we will touch in the forthcoming lectures. For now, I just want to put scipy
on your radar.
from scipy import optimize
# Set our objective function
def f(x):
return .3*x**3 -x**2 + .4*x +2
result = optimize.minimize_scalar(f,bounds=(-4,4),method='bounded')
result