This is an automated email from the ASF dual-hosted git repository. ycai pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-sidecar.git
The following commit(s) were added to refs/heads/trunk by this push: new bc219cb Add the CONTRIBUTING.md doc with guidelines and best practices bc219cb is described below commit bc219cbf75bdbfdc7a95b3160ef17332c9274b44 Author: Francisco Guerrero <francisco.guerr...@apple.com> AuthorDate: Tue Mar 29 12:09:06 2022 -0700 Add the CONTRIBUTING.md doc with guidelines and best practices The document delves into the contribution guidelines, source code best practices and source code style. --- CONTRIBUTING.md | 329 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 9 +- 2 files changed, 336 insertions(+), 2 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..679eb6b --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,329 @@ +# Contributing to Apache Cassandra Sidecar + +We warmly welcome and appreciate contributions from the community. + +## Table of Contents + +* [How to Contribute](#how-to-contribute) + * [Discuss](#discuss) + * [Create a Ticket](#ticket) +* [Source Code Best Practices](#best-practices) + * [Asynchronous Programming](#async-programming) + * [Thread Pool Model](#thread-pools) + * [One-shot Timers and Periodic Timers](#timers) + * [RestEasy Integration](#resteasy) + * [Dependency Injection](#guice) + * [Handler Chaining](#chaining-handlers) + * [Asynchronous Handlers](#async-handlers) + * [Future Composition](#future-composition) + * [Failure Handling](#failure-handling) +* [Source Code Style](#source-code-style) + +## <a name="how-to-contribute"></a>How to Contribute + +### <a name="discuss"></a>Discuss + +Find an existing issue on Jira, or start a new discussing in the Apache Cassandra mailing list. + +### <a name="ticket"></a>Create a Ticket + +Before creating a ticket, please take the time to [research first](#discuss). + +If you are creating a new Jira after a discussion on the Apache Cassandra mailing list, please provide a +self-sufficient description in the ticket. This is certainly extra work, but Jira is the place where we capture +important design discussions and decisions that can often be referenced after the fix version, to understand +the origin of a feature, understand design decisions, and so on. + +When ready create a Jira ticket. + +## <a name="best-practices"></a>Source Code Best Practices + +The Apache Cassandra Sidecar project uses the [vertx](https://vertx.io) toolkit. It uses the asynchronous and +reactive programming paradigm. This allows for Sidecar to scale up as workloads grow, as well as resiliency when +failures arise. + +<a name="traditional-app-server"></a> +In a traditional application server, a pool of threads is used to receive request. Each request is handled by a +distinct thread. This model is well suited for small-medium workloads. As workloads grow, applications need to scale +horizontally because of hardware limits on the number of threads that can be created. + +### <a name="async-programming"></a>Asynchronous Programming + +In the Asynchronous Programming model, the same hardware is able to handle more requests. This model uses fewer threads +to process incoming connections. When blocking I/O operations occur, the thread moves on to the next task (handling +a different request for example), and then once the I/O operation has completed the thread will come back to the +initial task. + +Vertx multiplexes concurrent workloads using event loops. This allows taking advantage of the existing hardware more +effectively to handle more requests. + +Any blocking I/O or CPU intensive processing needs to be handled outside the event loop threads. NEVER BLOCK THE +EVENT LOOP THREAD! + +### <a name="thread-pools"></a>Thread Pool Model + +Vertx uses different thread pools for internal processing. By default, vertx will use the event loop thread pool, +the worker thread pool, and the internal worker thread pool. + +#### The event loop thread pool (i.e. vert.x-eventloop-thread-1) + +The event loop thread pool threads handle events for processing. When the event comes in, it is dispatched to a +handler. It is expected that the processing will be complete quickly per event. If it doesn't you will see log +entries warning you that the event loop thread has been blocked. Vertx provisions a thread to detect blocked threads +in the execution. + +``` +Thread vertx-eventloop-thread-3 has been blocked for 20458 ms +``` + +If you see log entries like the one above, make sure to audit your code to understand your code and what is +causing your code to block the event pool thread. Consider moving the blocking I/O operations or CPU-intensive +operations to a worker thread pool. + +#### The worker thread pool (i.e. vert.x-worker-thread-1) + +This thread pool is dedicated to handling blocking I/O operations or CPU-intensive operations that might block +event loop threads. By default, the thread pool has 20 threads. The number of worker threads can be configured +when configuring the `DeploymentOptions` for vertx. + +#### The internal worker thread pool (i.e. vert.x-internal-worker-thread-1) + +An internal worker thread pool used by vertx for internal operations. Similarly to the worker thread pool, the +internal worker thread pool has 20 threads by default. Do not use this thread pool directly, use the worker +thread pool instead. + +### <a href="timers"></a>One-shot Timers and Periodic Timers + +Use vertx APIs to set one-shot timers and periodic timers. If you need to execute a one-time operation in the future, +or if you need to run periodic operations within vertx, an API is available. These timers utilize vertx provisioned +threads that are managed internal by vertx. For example + +```java +logger.debug("Retrying streaming after {} millis", millis); +vertx.setTimer(millis, t -> acquireAndSend(context, filename, fileLength, range, startTime)); +``` + +### <a href="resteasy"></a>RestEasy Integration + +The Apache Cassandra Sidecar project uses vertx-resteasy, and is the preferred method to provision endpoints. The +exception to the rule is when you require access to low-level APIs that are not available through RestEasy. Here's +an example of a simple RestEasy handler. + +```java +@Path("/api/v1/time-skew") +public class TimeSkewHandler { + private final TimeSkewInfo info; + + @Inject + public TimeSkewHandler(TimeSkewInfo info) { + info = info; + } + + @GET + public TimeSkewResponse getTimeSkewResponse() { + return new TimeSkewResponse(System.currentTimeMillis(), info.allowableSkewInMinutes());; + } +} +``` + +### <a href="guice"></a>Dependency Injection + +The Apache Cassandra Sidecar project uses Guice for handling the object interdependency. When a class encapsulates +some functionality that is then used by a different class in the system, we use Guice for dependency injection. + +When a different implementation can be provided in the future, prefer using an interface and a default implementation +that can be later switched transparently without affecting the classes that depend on it. + +Prefer creating concrete classes over utility methods. The concrete classes can be managed as a +[Singleton](https://en.wikipedia.org/wiki/Singleton_pattern) if they do not have external dependencies that might +change their behavior based on the configuration. + +Here's an example of a class being managed by Guice. + +```java +/** + * Verifies the checksum of a file + */ +public interface ChecksumVerifier { + Future<Boolean> verify(String expectedhash, String filename); +} +``` + +Let's say we support `MD5` for checksum verification, our implementation can look like this: + +```java +public class MD5ChecksumVerifier implements ChecksumVerifier { + public Future<Boolean> verify(String expectedhash, String filename) { + return fs.open(filename, new OpenOptions()) + .compose(this::calculateMD5) + .compose(computedChecksum -> { + if (!expectedHash.equals(computedChecksum)) + return Future.failedFuture(); + return Future.succeededFuture(true); + }); + } + } +``` + +A new implementation of `ChecksumVerifier` that uses a different hashing algorithm can be injected later. + +### <a name="chaining-handlers"></a>Handler Chaining + +Vertx allows you to chain handlers. Each handler in the chain will process a small unit of operation. This becomes +useful when you want to reuse code in different routes. For example the `FileStreamHandler` which is used by multiple +routes. + +Here's an example of a route that uses multiple handlers. + + +```java +router.route().method(HttpMethod.GET) + .path("/api/v1/keyspace/:keyspace/table/:table/snapshots/:snapshot/component/:component") + .handler(validateQualifiedTableHandler) + .handler(streamSSTableComponentHandler) + .handler(fileStreamHandler); +``` + +This route chains three handlers. The first handler validates the keyspace and table name provided as part of the +path parameters. The second handler determines the path on disk of the component that is going to be streamed. Lastly, +the file stream handler will stream the file from the previous handler back to the client. + +Note how the validation handler can be reused by other routes that also need keyspace and table name validation; +and the filestream handler can be reused by other routes that need to perform file streaming. + +Let's take a look at these handlers a little more in detail. + +```java +public class ValidateQualifiedTableHandler implements Handler<RoutingContext> { + public void handle(RoutingContext ctx) { + String ks = ctx.pathParam("keyspace"); + String tn = ctx.pathParam("table"); + if (!isValidKeyspaceName(ks)) ctx.fail("Invalid keyspace"); + else if (!isValidTable(tn)) ctx.fail("Invalid table"); + else ctx.next(); + } +} +``` + +```java +public class StreamSSTableComponentHandler implements Handler<RoutingContext> { + public void handle(RoutingContext ctx) { + String ks = ctx.pathParam("keyspace"); + String tn = ctx.pathParam("table"); + … + pathBuilder.build(host, ks, tn, snapshot, component) + .onSuccess(p -> ctx.put("filename", p).next()) + .onFailure(ctx::fail); + } +} +``` + +```java +public class FileStreamHandler implements Handler<RoutingContext> { + public void handle(RoutingContext context) { + final String localFile = context.get("filename"); + FileSystem fs = context.vertx().fileSystem(); + fs.exists(localFile) + .compose(exists -> ensureValidFile(fs, localFile, exists)) + .compose(fileProps -> fileStreamer.stream(new HttpResponse(context.response()), localFile, fileProps.size(), context.request().getHeader(HttpHeaderNames.RANGE))) + .onFailure(context::fail); + } +} +``` + +### <a name="async-handlers"></a>Asynchronous Handlers + +When [RestEasy](#resteasy) is not a suitable approach, use asynchronous handlers. Vertx has support for both +asynchronous and blocking handlers. When using a blocking handler, you will take up a worker thread for the entire +execution of the handler. This is similar to what the [traditional application servers](#traditional-app-server) +do with their threading model, which doesn't take full advantage of the asynchronous and reactive benefits that vertx +offers. + +Blocking handler: + +```java +router.get("/blockingHandler") + .blockingHandler(listSnapshotFiles); +``` + +Asynchronous handlers: + +```java +router.route().method(HttpMethod.GET) + .path("/asyncHandler") + .handler(streamSSTableComponentHandler) + .handler(fileStreamHandler); +``` + +### <a name="future-composition"></a>Future Composition + +Future composition is a feature that allows you to chain multiple futures. When the current future succeeds, then +it applies to the function down the chain. When one of the futures fails, the composition fails. + +```java +FileSystem fs = vertx.fileSystem(); +Future<Void> future = fs.createFile("/foo") + .compose(v -> fs.writeFile("/foo", Buffer.buffer())) + .compose(v -> fs.move("/foo", "/bar")); +``` + +If one of the future fails above, the chain is stopped and a failed future results from the chain. If all the futures +succeed, then the chain succeeds. + +### <a name="failure-handling"></a>Failure Handling + +If you need to provide feedback to the client about a request, use `HttpException` with the appropriate status code +and a message that describes the issue and helps the client fix the issue. + +For example: + +```java +throw new HttpException(BAD_REQUEST.code(), "Computed MD5 checksum does not match expected"); +``` + +The exception can be added directly to the `RequestContext` inside the handler: + +```java +context.fail(new HttpException(TOO_MANY_REQUESTS.code(), "Retry exhausted")); +``` + +Be careful what you add in the response payload of the `HttpException`. Bad actors can use information from these +responses to try to compromise the system. + +A look at the new `SidecarFailureHandler` class: + +```java +public class SidecarFailureHandler implements Handler<RoutingContext> { + @Override + public void handle(RoutingContext ctx) { + Throwable t = ctx.failure(); + if (t instanceOf HttpException) handleHttpException(ctx, (HttpException) t); + else if (ctx.statusCode() == REQUEST_TIMEOUT.code()) + handleRequestTimeout(ctx); + else ctx.next(); // handled by vertx + } + + private void handleHttpException(RoutingContext ctx, HttpException e) { + JsonObject payload = new JsonObject() + .put("status", "Fail") + .put("message", e.getPayload()); + writeResponse(ctx, e.getStatusCode(), payload); + } + … +} +``` + +## <a name="source-code-style"></a>Source Code Style + +The project provides an +[IntelliJ IDEA code formatting configuration](https://github.com/apache/cassandra-sidecar/blob/trunk/ide/idea/codeStyleSettings.xml) +that defines the source file coding standards. + +To import the formatting configuration run the following gradle task: + +```shell +./gradlew copyCodeStyle +``` + +This will install the style settings into the `.idea` directory located at the root of the project directory. +You can then use the provided configuration that adhere to the project source file coding standards. \ No newline at end of file diff --git a/README.md b/README.md index f7643f6..2bdf4ef 100644 --- a/README.md +++ b/README.md @@ -34,8 +34,8 @@ The easiest way to get started locally is by installing [Minikube](https://kuber Start minikube with a command similar to the following. Use a netmask appropriate for your local network, and allow minikube to use as much RAM as you can afford to: minikube start --insecure-registry "192.168.0.0/16" --addons registry --memory 8G --cpus=4 - -This will create a MiniKub cluster using the default driver. On OSX, this is hyperkit. + +This will create a MiniKube cluster using the default driver. On OSX, this is hyperkit. Enabling the tunnel is required in certain environments for tests to connect to the instances. @@ -72,6 +72,11 @@ CircleCI Testing You will need to use the "Add Projects" function of CircleCI to set up CircleCI on your fork. When promoted to create a branch, do not replace the CircleCI config, choose the option to do it manually. CircleCI will pick up the in project configuration. +Contributing +------------ + +We warmly welcome and appreciate contributions from the community. Please see [CONTRIBUTING.md](CONTRIBUTING.md) +if you wish to submit pull requests. Wondering where to go from here? -------------------------------- --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org