My initial reaction is: why are you testing at 4 levels if you are screening?  For the most part, the objective of a screening design is to examine a large number of factors in an efficient number of treatments.  This relies on the principles of effect sparsity, hierarchy and heredity.  A 4-level factor lets you estimate effects beyond what screening designs are intended for: higher-order terms (up to cubic) if the factor is continuous, or three degrees of freedom of level comparisons if it is categorical.  Can you pick the extremes of the 4 levels (which grades do you think will be the most different?) and test that factor at 2 levels?
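
To put rough numbers on the run-count argument, here is a minimal Python sketch. I don't know your actual factor list, so the 5 factors and the grade labels g1..g4 are purely hypothetical, and the half fraction shown is just one common 2-level screening choice, not necessarily the design you should run.

```python
from itertools import product

def full_factorial(levels_per_factor):
    """All combinations of the given levels, one tuple per run."""
    return list(product(*levels_per_factor))

def half_fraction(k):
    """2-level half fraction, 2^(k-1) runs: the last factor is set to the
    product of the first k-1 factors (defining relation I = AB...K)."""
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for x in base:
            last *= x
        runs.append(base + (last,))
    return runs

# Hypothetical setup: 5 factors, one of which (grade) has 4 levels.
mixed = full_factorial([(-1, 1)] * 4 + [("g1", "g2", "g3", "g4")])
# Same 5 factors with grade reduced to its two extreme levels.
two_level = full_factorial([(-1, 1)] * 5)
# Half fraction of the all-2-level design.
screen = half_fraction(5)

print(len(mixed), len(two_level), len(screen))
```

Dropping the grade factor to 2 levels halves the full factorial, and the half fraction halves it again while keeping every column balanced.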
					
	"All models are wrong, some are useful" G.E.P. Box