I'm not sure I can answer your question, but here are my thoughts:
1. The measurement system doesn't discriminate enough to be of much use for detecting trends, especially when you aren't using the entire scale. With only two values in play (e.g., 4 or 5), you essentially have nominal data.
2. Change the definitions of the categories. For example, make the current 3 the new 1, keep the current 5 as 5, and define three new categories between them.
3. You might try increasing the number of panelists and using the average of their scores (after you assess the between-panelist variation).
4. Find a different, continuous measure that correlates with the odor measurement.
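To make points 3 and 4 a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the scores, the instrument readings, and the function names are made up, and the spread of the panelist means is only a quick first look at between-panelist variation, not a full measurement system study.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical odor scores: one row per sample (in time order),
# one column per panelist. The values are invented for illustration.
scores = [
    [4, 4, 5],
    [4, 5, 5],
    [5, 5, 5],
    [4, 4, 4],
    [5, 4, 5],
]

# Hypothetical continuous measurement taken on the same samples
# (for example, an instrument reading). Also invented.
readings = [1.2, 1.9, 2.6, 0.8, 2.0]


def panelist_means(table):
    """Average score given by each panelist across all samples."""
    return [mean(column) for column in zip(*table)]


def sample_means(table):
    """Panel average for each sample."""
    return [mean(row) for row in table]


def distinct_levels(values):
    """Number of different values that actually occur."""
    return len(set(values))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Point 3: look at between-panelist variation before averaging.
p_means = panelist_means(scores)
between_panelist_sd = stdev(p_means)

# Averaging across the panel gives more distinct levels than one rater.
single_panelist = [row[0] for row in scores]
panel_average = sample_means(scores)

# Point 4: check whether the continuous measure tracks the odor score.
r = pearson(panel_average, readings)

print("panelist means:", p_means)
print("between-panelist SD:", between_panelist_sd)
print("levels, one panelist:", distinct_levels(single_panelist))
print("levels, panel average:", distinct_levels(panel_average))
print("correlation with readings:", r)
```

With a single panelist the made-up data only ever shows a 4 or a 5, while the panel average lands on more levels, which is the extra discrimination point 3 is after. If one panelist's mean sits well away from the others, sort that out before trusting the average.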
"All models are wrong, some are useful" G.E.P. Box