Rectifying Shortcut Behaviors in Preference-Based Reward Learning