errors generated when using librados storage type


errors generated when using librados storage type

nmtadam
I'm encountering errors when I try to use librados as a storage type.

Error generated by the controller in system.log:

org.codehaus.jackson.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')

Error generated by the driver in libs.log:

java.lang.UnsupportedClassVersionError: com/ceph/rados/RadosException : Unsupported major.minor version 51.0

I'm using java version "1.6.0_45".
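(For context on the second error: a class file's major version is stored in bytes 6 and 7 of its header, and 50 means Java 6 while 51 means Java 7, so a 1.6 JVM refuses classes built for 1.7. A quick sketch of how to read that field, using a synthetic class-file header rather than a real rados-java class:)

```shell
# A class file's major version is stored in bytes 6 and 7 of its header:
# 50 = Java 6, 51 = Java 7. A 1.6 JVM refuses to load major version 51.
# Synthetic header for demonstration: magic CAFEBABE, minor 0, major 51.
classfile=$(mktemp)
printf '\312\376\272\276\000\000\000\063' > "$classfile"
major=$(od -A n -t u1 -j 6 -N 2 "$classfile" | awk '{print $1 * 256 + $2}')
echo "class file major version: $major"   # prints: class file major version: 51
```

On a real system you would run the same od command against the RadosException class extracted from the adapter's jar.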

I was wondering if anyone else has encountered these issues or can provide some advice on how to fix this.

Thanks,
Adam

Re: errors generated when using librados storage type

Niklas Goerke
I pushed an updated version of the librados adapter, now using a java-rados build that I created with Java 1.6. It may be that the previous build of java-rados was created with Java 1.7… I'm sorry about that.

Please report back whether the fix worked.

Re: errors generated when using librados storage type

Niklas Goerke
Forgot to mention: I pushed it to my Git repository and created a pull request. You may have to take it from my GitHub account while waiting for the pull request to be accepted:

https://github.com/Niklas974/cosbench/tree/librados

Re: errors generated when using librados storage type

nmtadam
Niklas,

Using Java 6 was my problem. I had installed Java 7 on the cluster nodes, but I forgot to set Java 7 as the default version. Once that was done, I was able to use librados as a storage type. I appreciate your help, and thanks for writing the adapter.
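In case it helps others who hit the same thing: one low-impact way to make a specific JDK the default is to put its bin directory first on PATH (the install path below is an assumption; adjust to wherever Java 7 lives on your nodes). On Debian-style systems, `sudo update-alternatives --config java` does the system-wide equivalent.

```shell
# Make a specific JDK the default for this shell by putting it first on
# PATH. The install path below is hypothetical; adjust to your system.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"

# 'java' should now resolve inside $JAVA_HOME (if a JDK is installed there):
case "$PATH" in
  "$JAVA_HOME/bin:"*) echo "PATH prefers $JAVA_HOME/bin" ;;
esac
```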

Adam

Re: errors generated when using librados storage type

jrgruher
Hey Adam, Niklas has been kind enough to provide a sample librados workload file in another thread, but it always helps to have a few extra examples.  Would you mind posting one of the workload files that worked successfully for you?  I've been having a little trouble constructing them on my own.  Thanks!

Re: errors generated when using librados storage type

nmtadam
In reply to this post by nmtadam
Here you go, jrgruher; good luck getting going. The endpoint is the IP address of one of my monitors.

<?xml version="1.0" encoding="UTF-8" ?>
<workload name="librados-sample" description="sample benchmark for librados">

<storage type="librados" config="accesskey=admin;secretkey=key-in-ceph.client.admin.keyring;endpoint="/>

  <workflow>

    <workstage name="init">
      <work type="init" workers="1" config="containers=r(1,2)" />
    </workstage>

    <workstage name="prepare">
      <work type="prepare" workers="1" config="containers=r(1,2);objects=r(1,10);sizes=c(10240)KB" />
    </workstage>

    <workstage name="main">
      <work name="main" workers="8" runtime="60">
        <operation type="read" ratio="50" config="containers=u(1,2);objects=u(1,10)" />
        <operation type="write" ratio="50" config="containers=u(1,2);objects=u(11,20);sizes=c(10240)KB" />
      </work>
    </workstage>

    <workstage name="cleanup">
      <work type="cleanup" workers="1" config="containers=r(1,2);objects=r(1,20)" />
    </workstage>

    <workstage name="dispose">
      <work type="dispose" workers="1" config="containers=r(1,2)" />
    </workstage>

  </workflow>

</workload>

Re: errors generated when using librados storage type

Niklas Goerke
Please be careful about using COSBench to create your containers (= pools). The librados protocol that lies underneath does NOT allow setting the number of placement groups for each pool. The default of 8 will be used, which is definitely NOT suitable for most systems.

Please read [1] and [2] for further details. I strongly recommend creating the pool yourself, using the command documented in [1] and the suggested number of PGs as documented in [2].
Results from initialising the pool with COSBench / librados are NOT comparable to a properly set up pool!



[1] http://ceph.com/docs/master/rados/operations/pools/#create-a-pool
[2] http://ceph.com/docs/master/rados/operations/placement-groups/
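To make that concrete, here is a sketch of the usual sizing rule from [2] plus manual pool creation per [1]; the OSD and replica counts below are made-up examples, not a recommendation for any particular cluster:

```shell
# Rule of thumb from the placement-group docs: total PGs is roughly
# (number of OSDs * 100) / replica count, rounded up to a power of two.
osds=12
replicas=2
target=$(( osds * 100 / replicas ))
pgs=1
while [ "$pgs" -lt "$target" ]; do pgs=$(( pgs * 2 )); done
echo "suggested pg_num: $pgs"    # prints: suggested pg_num: 1024

# Then create the pool yourself before running the workload, e.g.:
# ceph osd pool create mycontainers_1 "$pgs" "$pgs"
```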

Re: errors generated when using librados storage type

nmtadam
Thanks for the tip Niklas. I hadn't really thought about performance tuning at this point, but was going to have to do so soon enough. Did you use the async read/write calls in librados for the adaptor? I've seen fairly large performance discrepancies between sync/async usage of librados.

Re: errors generated when using librados storage type

Niklas Goerke
Hi

I used rados-java by Wido [1], which uses synchronous read/write (at least, that is how I understand JNA). I think you should be able to get the same performance as with async I/O by simply increasing the number of threads.

[1] https://github.com/Niklas974/rados-java

Re: errors generated when using librados storage type

jrgruher
In reply to this post by Niklas Goerke
Niklas Goerke wrote
Please be careful about using COSBench to create your containers (= pools). The librados protocol that lies underneath does NOT allow setting the number of placement groups for each pool. The default of 8 will be used, which is definitely NOT suitable for most systems.
You can configure a default number of placement groups (PGs and PGPs) for new pools in ceph.conf. If that value were set properly, I wonder whether it would then be used for pools created through librados? It seems like it should.

Re: errors generated when using librados storage type

jrgruher
I have confirmed that if you set the default PG and PGP counts in ceph.conf, those values are used for pools created by COSBench. Note it takes Ceph a while to handle all the new PGs, so it probably makes sense to add a suitably long delay workstage after the init workstage.
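Such a delay workstage could look roughly like the fragment below, placed between the init and prepare stages (syntax from memory of the COSBench configuration guide; please verify the delay work type and its duration parameter against the docs for your COSBench version):

```xml
<workstage name="settle">
  <work type="delay" workers="1" config="duration=120" />
</workstage>
```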

In ceph.conf:
osd pool default pg num = 2048
osd pool default pgp num = 2048

Resulting pools created by COSBench:
pool 71 'mycontainers_1' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 487 owner 0
pool 72 'mycontainers_2' rep size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 488 owner 0

Also, while this worked fine for an init of two pools, an init of 32 pools failed after the 14th pool, and my worker crashed/halted. I had to restart it, and the run failed. So there may be some gotchas with large/complex init stages.

Re: errors generated when using librados storage type

ywang19
Administrator
It's a bit interesting to see crashes at the init stage. Could you share the log files?


-yaguang