S3 md5 hash fails


mnabi95
Hello,

I've been getting MD5 errors on reads where the hashes do not match. See the stack trace below.

com.amazonaws.AmazonClientException: Unable to verify integrity of data download.  Client calculated content hash didn't match hash calculated by Amazon S3.  The data may be corrupt.
        at com.amazonaws.services.s3.internal.DigestValidationInputStream.validateMD5Digest(DigestValidationInputStream.java:79)
        at com.amazonaws.services.s3.internal.DigestValidationInputStream.read(DigestValidationInputStream.java:61)
        at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:72)
        at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:155)
        at com.amazonaws.services.s3.model.S3ObjectInputStream.read(S3ObjectInputStream.java:147)
        at com.intel.cosbench.driver.operator.Reader.copyLarge(Reader.java:120)
        at com.intel.cosbench.driver.operator.Reader.doRead(Reader.java:92)
        at com.intel.cosbench.driver.operator.Reader.operate(Reader.java:69)
        at com.intel.cosbench.driver.operator.AbstractOperator.operate(AbstractOperator.java:76)
        at com.intel.cosbench.driver.agent.WorkAgent.performOperation(WorkAgent.java:197)
        at com.intel.cosbench.driver.agent.WorkAgent.doWork(WorkAgent.java:177)
        at com.intel.cosbench.driver.agent.WorkAgent.execute(WorkAgent.java:134)
        at com.intel.cosbench.driver.agent.AbstractAgent.call(AbstractAgent.java:44)
        at com.intel.cosbench.driver.agent.AbstractAgent.call(AbstractAgent.java:1)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
        at java.lang.Thread.run(Thread.java:748)


I have tried this against both a Scality S3 server and a Ceph cluster, and the error was consistent.
I found this issue on GitHub: https://github.com/intel-cloud/cosbench/issues/320
and it seems to be the same as mine. I also noticed that the more workers I have, the lower my success ratio, with a 100% success ratio for one worker.

I turned off the md5 checks and got a 100% success ratio.
My question is: does anyone know why the checks fail?
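
For context, my understanding is that the failing check amounts to hashing the downloaded bytes on the client and comparing the result with the hash the server reports. Below is a minimal sketch of that idea using only the JDK. It is not the AWS SDK's actual implementation, and the names Md5Check, md5Hex and matches are hypothetical, made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of COSBench or the AWS SDK.
class Md5Check {

    // Reads the stream to the end and returns the MD5 of its bytes as lowercase hex.
    static String md5Hex(InputStream in) {
        try {
            // A fresh digest per stream; MessageDigest instances are not thread-safe.
            MessageDigest md = MessageDigest.getInstance("MD5");
            try (DigestInputStream dis = new DigestInputStream(in, md)) {
                byte[] buf = new byte[8192];
                while (dis.read(buf) != -1) {
                    // Reading through DigestInputStream updates the digest.
                }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when the locally computed hash equals the hash reported by the server.
    static boolean matches(InputStream body, String expectedHex) {
        return md5Hex(body).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        String reported = "5d41402abc4b2a76b9719d911017c592"; // MD5 of "hello"
        System.out.println(matches(new ByteArrayInputStream(data), reported));
    }
}
```

In the sketch, a mismatch means either the bytes read differ from what the server hashed, or the client-side digest was computed incorrectly. I have not determined which of these applies in my case.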


My config file:


<?xml version="1.0" encoding="UTF-8" ?>
<workload name="16workers" description="sample benchmark for s3">
  <storage type="s3" config="accesskey=xxxxx;secretkey=xxxxx;endpoint=xxxx;path_style_access=true"/>

  <workflow>
    <workstage name="init">
      <work type="init" workers="1" config="cprefix=s3testqwer;containers=r(1,20)" />
    </workstage>

    <workstage name="prepare">
      <work type="prepare" workers="16" config="cprefix=s3testqwer;containers=r(1,20);objects=r(1,100);sizes=c(64)MB" />
    </workstage>

    <workstage name="main">
      <work name="main" workers="16" runtime="30">
        <operation type="read" ratio="80" config="cprefix=s3testqwer;containers=u(1,20);objects=u(1,100)" />
        <operation type="write" ratio="20" config="cprefix=s3testqwer;containers=u(1,20);objects=u(101,150);sizes=c(64)MB" />
      </work>
    </workstage>

    <workstage name="cleanup">
      <work type="cleanup" workers="1" config="cprefix=s3testqwer;containers=r(1,20);objects=r(1,150)" />
    </workstage>

    <workstage name="dispose">
      <work type="dispose" workers="1" config="cprefix=s3testqwer;containers=r(1,20)" />
    </workstage>

  </workflow>

</workload>