Discussion thread | https://lists.apache.org/thread/670qw80wwfflgv3djqg4304xqy9y8l19
---|---
Vote thread | https://lists.apache.org/thread/2lqq021vyc98w3yly678s8lpv0o8vpz5
JIRA |
Release | 0.6.0
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
- Start the Runner and Scheduler, and use CLI commands to submit verification plan files to simulate the following anomalies and verify Celeborn's stability.
  - Kill master process
  - Kill worker process
  - Worker directory not writable
  - Worker disk IO hang
  - High CPU load
  - Master node metadata corruption
- Mock the shuffle process and support implementing additional corner cases, so that each stage of the shuffle can be tested and the workloads enriched.
- Provide a Helm chart to deploy the chaos testing framework on Kubernetes.
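
To make the idea of a verification plan more concrete, the following is a minimal sketch of how a sequence of the anomalies listed above might be modeled and dry-run. It is an illustration only and does not reflect the framework's actual plan file format or CLI: the names `Anomaly`, `Step`, `dry_run`, the node names, and the durations are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Anomaly(Enum):
    """The anomalies named in the test plan (identifiers are hypothetical)."""
    KILL_MASTER = "kill master process"
    KILL_WORKER = "kill worker process"
    WORKER_DIR_NOT_WRITABLE = "worker directory not writable"
    WORKER_DISK_IO_HANG = "worker disk IO hang"
    HIGH_CPU_LOAD = "high CPU load"
    MASTER_METADATA_CORRUPTION = "master node metadata corruption"


@dataclass(frozen=True)
class Step:
    anomaly: Anomaly
    target: str      # hypothetical node name
    duration_s: int  # assumed: how long the fault stays injected


def dry_run(plan: List[Step]) -> List[str]:
    """Describe what a scheduler might do for each step, without touching any node."""
    lines = []
    elapsed = 0
    for i, step in enumerate(plan, 1):
        lines.append(
            f"[t={elapsed}s] step {i}: inject '{step.anomaly.value}' "
            f"on {step.target} for {step.duration_s}s"
        )
        elapsed += step.duration_s
        lines.append(
            f"[t={elapsed}s] step {i}: recover {step.target}, then check cluster health"
        )
    return lines


# Example plan: three faults applied one after another.
plan = [
    Step(Anomaly.KILL_WORKER, "worker-1", 30),
    Step(Anomaly.WORKER_DISK_IO_HANG, "worker-2", 60),
    Step(Anomaly.KILL_MASTER, "master-1", 30),
]

log = dry_run(plan)
for line in log:
    print(line)
```

Running each fault to completion and checking health after recovery keeps the sketch simple; a real plan could equally overlap faults or repeat them, which this example does not attempt to cover.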
Rejected Alternatives
No alternatives were rejected for the chaos testing framework.