[autotest] limit repair failed count to the same host and hqe

The limit was added so that we won't repeatedly repair a host for a job
created from AFE. The code path has a bug that sets the host to repair failed
status even for jobs created with meta_host, and even when the host is being
repaired for the first time.

This CL limits the count of repair jobs to those with the same host and hqe.
Thus, a host can still be repaired if an hqe has failed on multiple hosts.
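
For illustration only, the per-(host, hqe) counting can be sketched as below.
RepairTask, count_failed_repairs and should_queue_repair are hypothetical
names, not the scheduler's actual code, and the strict comparison against
max_repair_limit is an assumption chosen so that a limit of 0 queues no
repair job.

```python
from collections import namedtuple

# Hypothetical record of one repair attempt; not the actual autotest model.
RepairTask = namedtuple('RepairTask', ['host_id', 'hqe_id', 'success'])


def count_failed_repairs(tasks, host_id, hqe_id):
    """Count failed repairs that match both the host and the hqe."""
    return sum(1 for t in tasks
               if t.host_id == host_id
               and t.hqe_id == hqe_id
               and not t.success)


def should_queue_repair(tasks, host_id, hqe_id, max_repair_limit):
    """Return True if another repair may be queued for this host/hqe pair.

    Assumption: a repair is allowed while the number of failed repairs for
    the same host and hqe is strictly below max_repair_limit.
    """
    failed = count_failed_repairs(tasks, host_id, hqe_id)
    return failed < max_repair_limit


# The same hqe (10) failed twice on host 1 and once on host 2.
history = [
    RepairTask(host_id=1, hqe_id=10, success=False),
    RepairTask(host_id=1, hqe_id=10, success=False),
    RepairTask(host_id=2, hqe_id=10, success=False),
]
```

In this sketch, should_queue_repair(history, 1, 10, 2) is False because host 1
already used both attempts for hqe 10, while should_queue_repair(history, 2,
10, 2) is True: failures of the same hqe on another host do not count against
host 2.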

DEPLOY=scheduler
BUG=chromium:392496,chromium:426905
TEST=local
Set max_repair_limit in global config to 0 and raise an exception in reset to
force reset to fail.

test frontend job:
Create a job from AFE with a given host. Confirm that the dut goes into repair
failed status and no repair job is queued.

test suite job:
Create a suite job.
When max_repair_limit is set to 0, confirm that the duts go into repair failed
status and no repair job is queued.
When max_repair_limit is set to 2, confirm that a repair job is created after
reset failure.

Change-Id: Icf737f7ff90a96edd6f08b5d79f431b66313d242
Reviewed-on: https://chromium-review.googlesource.com/225442
Reviewed-by: Dan Shi <dshi@chromium.org>
Commit-Queue: Dan Shi <dshi@chromium.org>
Tested-by: Dan Shi <dshi@chromium.org>
1 file changed