Kind: SparkHistoryServer
Group: spark.stackable.tech
Version: v1alpha1

apiVersion: spark.stackable.tech/v1alpha1
kind: SparkHistoryServer
spec object

A Spark cluster history server component. This resource is managed by the Stackable operator for Apache Spark. Find more information on how to use it in the operator documentation.


clusterConfig object

Global Spark history server configuration that applies to all roles and role groups.


listenerClass string: enum
Enum variants: cluster-internal, external-unstable, external-stable

This field controls which type of Service the Operator creates for this HistoryServer:

  • cluster-internal: Use a ClusterIP service

  • external-unstable: Use a NodePort service

  • external-stable: Use a LoadBalancer service

This is a temporary solution intended to keep YAML manifests forward compatible. In the future, this setting will control which ListenerClass (https://docs.stackable.tech/home/stable/listener-operator/listenerclass.html) is used to expose the service. The ListenerClass names will stay the same, so the change will be non-breaking.
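As a minimal sketch, the fragment below exposes the history server through a NodePort service. The resource name is a placeholder, and the other required fields (image, logFileDirectory, nodes) are omitted for brevity.

```yaml
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkHistoryServer
metadata:
  name: spark-history                  # hypothetical resource name
spec:
  clusterConfig:
    listenerClass: external-unstable   # NodePort service, per the list above
  # image, logFileDirectory and nodes are required but not shown here
```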

image object required

Specify which image to use. The easiest way is to configure only the productVersion. You can also configure a custom image registry to pull from, or use a completely custom image.

Consult the Product image selection documentation for details.


custom string

Overwrite the docker image. Specify the full docker image name, e.g. docker.stackable.tech/stackable/superset:1.4.1-stackable2.1.0

productVersion string

Version of the product, e.g. 1.4.1.

pullPolicy string: enum
Enum variants: IfNotPresent, Always, Never

Pull policy used when pulling the image.

pullSecrets []object

Image pull secrets to pull images from a private registry.


name string required

Name of the referent. This field is effectively required, but due to backwards compatibility is allowed to be empty. Instances of this type with an empty value here are almost certainly wrong. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names

repo string

Name of the docker repo, e.g. docker.stackable.tech/stackable

stackableVersion string

Stackable version of the product, e.g. 23.4, 23.4.1 or 0.0.0-dev. If not specified, the operator will use its own version, e.g. 23.4.1. When using a nightly operator or a PR version, it will use the nightly 0.0.0-dev image.
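The image fields above might be combined as in the following sketch. The product version and the pull secret name are illustrative assumptions, not values taken from this reference; check the operator documentation for the versions actually supported.

```yaml
spec:
  image:
    productVersion: 3.5.0              # assumed Spark version, adjust to a supported one
    pullPolicy: IfNotPresent           # one of IfNotPresent, Always, Never
    pullSecrets:
      - name: my-registry-credentials  # hypothetical Secret holding registry credentials
```

Leaving out stackableVersion lets the operator pick its own version, as described above.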

logFileDirectory object required

The log file directory definition used by the Spark history server.


customLogDirectory string

A custom log directory

s3 object

An S3 bucket storing the log events


bucket object required

No Description Provided.


inline object

S3 bucket specification containing the bucket name and an inlined or referenced connection specification. Learn more on the S3 concept documentation.


bucketName string required

The name of the S3 bucket.

connection object required

The definition of an S3 connection, either inline or as a reference.


inline object

S3 connection definition as a resource. Learn more on the S3 concept documentation.


accessStyle string: enum
Enum variants: Path, VirtualHosted

Which access style to use. Defaults to virtual hosted-style, as most data products do. Have a look at the AWS documentation.

credentials object

If the S3 service uses authentication, you have to specify your S3 credentials. In most cases a SecretClass providing accessKey and secretKey is sufficient.


scope object

listenerVolumes []string

The listener volume scope allows Node and Service scopes to be inferred from the applicable listeners. This must correspond to Volume names in the Pod that mount Listeners.

node boolean

The node scope is resolved to the name of the Kubernetes Node object that the Pod is running on. This will typically be the DNS name of the node.

pod boolean

The pod scope is resolved to the name of the Kubernetes Pod. This allows the secret to differentiate between StatefulSet replicas.

services []string

The service scope allows Pod objects to specify custom scopes. This should typically correspond to Service objects that the Pod participates in.

secretClass string required

SecretClass containing the S3 credentials (accessKey and secretKey).

host string required

Host of the S3 server without any protocol or port. For example: west1.my-cloud.com.

port integer

Port the S3 server listens on. If not specified the product will determine the port to use.

tls object

Use a TLS connection. If not specified no TLS will be used.


verification object required

The verification method used to verify the certificates of the server and/or the client.


none object

Use TLS but don't verify certificates.

server object

Use TLS and a CA certificate to verify the server.


caCert object required

CA cert to verify the server.


secretClass string

Name of the SecretClass which will provide the CA certificate. Note that a SecretClass does not need to have a key but can also work with just a CA certificate, so if you were given a CA cert but don't have access to the key, you can still use this method.

webPki object

Use TLS and the CA certificates trusted by the common web browsers to verify the server. This can be useful when you use, for example, public AWS S3 or other publicly available services.

reference string

No Description Provided.

reference string

No Description Provided.

prefix string required

No Description Provided.
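A possible logFileDirectory definition with an inline bucket and an inline connection is sketched below. The nesting is inferred from the field list above; the bucket name, prefix, port and SecretClass name are hypothetical placeholders.

```yaml
spec:
  logFileDirectory:
    s3:
      prefix: eventlogs/                 # hypothetical prefix inside the bucket
      bucket:
        inline:
          bucketName: spark-logs         # hypothetical bucket name
          connection:
            inline:
              host: west1.my-cloud.com   # no protocol or port
              port: 9000                 # assumed port; omit to let the product decide
              accessStyle: Path          # or VirtualHosted
              credentials:
                secretClass: history-s3-credentials  # hypothetical SecretClass
              tls:
                verification:
                  server:
                    caCert:
                      webPki: {}         # trust the CAs known to common web browsers
```

Instead of the inline objects, bucket and connection each accept a reference string.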

nodes object required

A history server node role definition.


cliOverrides object

No Description Provided.

config object

No Description Provided.


affinity object

These configuration settings control Pod placement.


nodeAffinity object

Same as the spec.affinity.nodeAffinity field on the Pod, see the Kubernetes docs

nodeSelector object

Simple key-value pairs forming a nodeSelector, see the Kubernetes docs

podAffinity object

Same as the spec.affinity.podAffinity field on the Pod, see the Kubernetes docs

podAntiAffinity object

Same as the spec.affinity.podAntiAffinity field on the Pod, see the Kubernetes docs
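As a small illustration of Pod placement, the sketch below pins history server Pods to nodes carrying a particular label. The label key and value are assumptions chosen for the example.

```yaml
spec:
  nodes:
    config:
      affinity:
        nodeSelector:
          disktype: ssd   # hypothetical node label
```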

cleaner boolean

No Description Provided.

logging object

Logging configuration, learn more in the logging concept documentation.


containers object

Log configuration per container.

enableVectorAgent boolean

Whether or not to deploy a container with the Vector log agent.

resources object

Resource usage is configured here. This includes CPU usage, memory usage and disk storage usage, if this role needs any.


cpu object

No Description Provided.


max string

The maximum amount of CPU cores that can be requested by Pods. Equivalent to the limit for Pod resource configuration. Cores are specified either as a decimal point number or as milli units. For example, 1.5 means 1.5 cores, which can also be written as 1500m.

min string

The minimal amount of CPU cores that Pods need to run. Equivalent to the request for Pod resource configuration. Cores are specified either as a decimal point number or as milli units. For example, 1.5 means 1.5 cores, which can also be written as 1500m.

memory object

No Description Provided.


limit string

The maximum amount of memory that should be available to the Pod. Specified as a byte Quantity, which means these suffixes are supported: E, P, T, G, M, k. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki. For example, the following represent roughly the same value: 128974848, 129e6, 129M, 128974848000m, 123Mi

runtimeLimits object

Additional options that can be specified.

storage object

No Description Provided.
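The CPU and memory fields above could be set as in this sketch. The numbers are illustrative only and are not recommended sizings.

```yaml
spec:
  nodes:
    config:
      resources:
        cpu:
          min: 250m      # request: a quarter of a core
          max: "1"       # limit: one full core
        memory:
          limit: 1Gi     # power-of-two suffix, see the list of suffixes above
```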

configOverrides object

The configOverrides can be used to configure properties in product config files that are not exposed in the CRD. Read the config overrides documentation and consult the operator specific usage guide documentation for details on the available config files and settings for the specific product.

envOverrides object

envOverrides configures environment variables to be set in the Pods. It is a map from strings to strings: environment variable names and the values to set. Read the environment variable overrides documentation for more information and consult the operator specific usage guide to find out about the product specific environment variables that are available.
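Because envOverrides is a plain string-to-string map, a role-level override might look like the following sketch. The variable name and value are made up for illustration.

```yaml
spec:
  nodes:
    envOverrides:
      MY_ENV_VAR: "some-value"   # hypothetical variable; values must be strings
```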

podOverrides object

In the podOverrides property you can define a PodTemplateSpec to override any property that can be set on a Kubernetes Pod. Read the Pod overrides documentation for more information.

roleConfig object

This is a product-agnostic RoleConfig, which is sufficient for most of the products.


podDisruptionBudget object

This struct is used to configure:

  1. If PodDisruptionBudgets are created by the operator

  2. The allowed number of Pods to be unavailable (maxUnavailable)

Learn more in the allowed Pod disruptions documentation.


enabled boolean

Whether a PodDisruptionBudget should be written out for this role. Disabling this lets you specify your own custom PodDisruptionBudget. Defaults to true.

maxUnavailable integer

The number of Pods that are allowed to be down because of voluntary disruptions. If you don't explicitly set this, the operator will use a sane default based upon knowledge about the individual product.
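The two settings might be combined as sketched here. The maxUnavailable value is an arbitrary example; omit it to keep the operator default.

```yaml
spec:
  nodes:
    roleConfig:
      podDisruptionBudget:
        enabled: true        # let the operator create the PodDisruptionBudget
        maxUnavailable: 1    # illustrative value
```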

roleGroups object required

No Description Provided.

sparkConf object

A map of key/value strings that will be passed directly to Spark when deploying the history server.
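For instance, standard Spark history server properties could be passed through as in the sketch below. The values are illustrative, and they are quoted because the map holds strings.

```yaml
spec:
  sparkConf:
    spark.history.fs.update.interval: "30s"     # how often to scan for new event logs
    spark.history.retainedApplications: "100"   # applications kept in the UI cache
```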

vectorAggregatorConfigMapName string

Name of the Vector aggregator discovery ConfigMap. It must contain the key ADDRESS with the address of the Vector aggregator.
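A sketch of log aggregation settings follows, pairing this field with enableVectorAgent from the logging section. The ConfigMap name is a placeholder, and the ConfigMap is assumed to already exist with an ADDRESS key.

```yaml
spec:
  vectorAggregatorConfigMapName: vector-aggregator-discovery  # hypothetical ConfigMap name
  nodes:
    config:
      logging:
        enableVectorAgent: true   # deploy the Vector agent container alongside the server
```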