public class StepFactory extends Object
Example usage: create an interactive Hive job flow with debugging enabled:
```java
AWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonElasticMapReduce emr = new AmazonElasticMapReduceClient(credentials);

StepFactory stepFactory = new StepFactory();

StepConfig enableDebugging = new StepConfig()
    .withName("Enable Debugging")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newEnableDebuggingStep());

StepConfig installHive = new StepConfig()
    .withName("Install Hive")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newInstallHiveStep());

RunJobFlowRequest request = new RunJobFlowRequest()
    .withName("Hive Interactive")
    .withSteps(enableDebugging, installHive)
    .withInstances(new JobFlowInstancesConfig()
        .withEc2KeyName("keypair")
        .withHadoopVersion("0.20")
        .withInstanceCount(5)
        .withKeepJobFlowAliveWhenNoSteps(true)
        .withMasterInstanceType("m1.small")
        .withSlaveInstanceType("m1.small"));

RunJobFlowResult result = emr.runJobFlow(request);
```
Modifier and Type | Class and Description |
---|---|
`static class` | `StepFactory.HiveVersion` The available Hive versions. |
Constructor and Description |
---|
`StepFactory()` Creates a new StepFactory using the default Elastic Map Reduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region. |
`StepFactory(String bucket)` Creates a new StepFactory using the specified Amazon S3 bucket to load resources. |
Modifier and Type | Method and Description |
---|---|
`HadoopJarStepConfig` | `newEnableDebuggingStep()` When run as the first step in your job flow, enables the Hadoop debugging UI in the AWS Management Console. |
`HadoopJarStepConfig` | `newInstallHiveStep()` Step that installs the default version of Hive on your job flow. |
`HadoopJarStepConfig` | `newInstallHiveStep(StepFactory.HiveVersion... hiveVersions)` Step that installs the specified versions of Hive on your job flow. |
`HadoopJarStepConfig` | `newInstallHiveStep(String... hiveVersions)` Step that installs the specified versions of Hive on your job flow. |
`HadoopJarStepConfig` | `newInstallPigStep()` Step that installs the default version of Pig on your job flow. |
`HadoopJarStepConfig` | `newInstallPigStep(String... pigVersions)` Step that installs Pig on your job flow. |
`HadoopJarStepConfig` | `newRunHiveScriptStep(String script, String... args)` Step that runs a Hive script on your job flow using the default Hive version. |
`HadoopJarStepConfig` | `newRunHiveScriptStepVersioned(String script, String hiveVersion, String... scriptArgs)` Step that runs a Hive script on your job flow using the specified Hive version. |
`HadoopJarStepConfig` | `newRunPigScriptStep(String script, String... scriptArgs)` Step that runs a Pig script on your job flow using the default Pig version. |
`HadoopJarStepConfig` | `newRunPigScriptStep(String script, String pigVersion, String... scriptArgs)` Step that runs a Pig script on your job flow using the specified Pig version. |
`HadoopJarStepConfig` | `newScriptRunnerStep(String script, String... args)` Runs a specified script on the master node of your cluster. |
public StepFactory()

Creates a new StepFactory using the default Elastic Map Reduce bucket (us-east-1.elasticmapreduce) for the default (us-east-1) region.
public StepFactory(String bucket)

Creates a new StepFactory using the specified Amazon S3 bucket to load resources. The official bucket format is "<region>.elasticmapreduce", so if you're using the us-east-1 region, you should use the bucket "us-east-1.elasticmapreduce".

Parameters:
bucket - The Amazon S3 bucket from which to load resources.
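For example, a client working against another region could point the factory at that region's bucket, following the "<region>.elasticmapreduce" format above; the region chosen below is illustrative only:

```java
// Load script-runner and the install scripts from an assumed example
// region's bucket ("eu-west-1" is not a value from this page).
StepFactory stepFactory = new StepFactory("eu-west-1.elasticmapreduce");
```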
public HadoopJarStepConfig newScriptRunnerStep(String script, String... args)

Runs a specified script on the master node of your cluster.

Parameters:
script - The script to run.
args - Arguments that get passed to the script.

public HadoopJarStepConfig newEnableDebuggingStep()

When run as the first step in your job flow, enables the Hadoop debugging UI in the AWS Management Console.
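As a minimal sketch of wrapping newScriptRunnerStep in a StepConfig, in the same style as the class example above; the step name, S3 script path, and flag are placeholder assumptions, not values from this documentation:

```java
StepFactory stepFactory = new StepFactory();

// "s3://mybucket/scripts/setup.sh" and "--verbose" are placeholders.
StepConfig runScript = new StepConfig()
    .withName("Run Setup Script")
    .withActionOnFailure("CONTINUE")
    .withHadoopJarStep(stepFactory.newScriptRunnerStep(
        "s3://mybucket/scripts/setup.sh", "--verbose"));
```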
public HadoopJarStepConfig newInstallHiveStep(StepFactory.HiveVersion... hiveVersions)

Step that installs the specified versions of Hive on your job flow.

Parameters:
hiveVersions - the versions of Hive to install

public HadoopJarStepConfig newInstallHiveStep(String... hiveVersions)

Step that installs the specified versions of Hive on your job flow.

Parameters:
hiveVersions - the versions of Hive to install

public HadoopJarStepConfig newInstallHiveStep()

Step that installs the default version of Hive on your job flow.
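A sketch of pinning a Hive version via the String overload; "0.8.1" is an assumed example version string, so check which versions your cluster's AMI actually supports:

```java
StepFactory stepFactory = new StepFactory();

// "0.8.1" is a placeholder Hive version, not a value from this page.
StepConfig installHive = new StepConfig()
    .withName("Install Hive 0.8.1")
    .withActionOnFailure("TERMINATE_JOB_FLOW")
    .withHadoopJarStep(stepFactory.newInstallHiveStep("0.8.1"));
```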
public HadoopJarStepConfig newRunHiveScriptStepVersioned(String script, String hiveVersion, String... scriptArgs)

Step that runs a Hive script on your job flow using the specified Hive version.

Parameters:
script - The script to run.
hiveVersion - The Hive version to use.
scriptArgs - Arguments that get passed to the script.

public HadoopJarStepConfig newRunHiveScriptStep(String script, String... args)

Step that runs a Hive script on your job flow using the default Hive version.

Parameters:
script - The script to run.
args - Arguments that get passed to the script.
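A sketch of running a Hive script as a step; the S3 paths and the variable-substitution argument are placeholder assumptions, not values from this documentation:

```java
StepFactory stepFactory = new StepFactory();

// The S3 paths and "-d" argument below are placeholders -- substitute
// your own script location and arguments.
StepConfig runHiveScript = new StepConfig()
    .withName("Run Hive Script")
    .withActionOnFailure("CANCEL_AND_WAIT")
    .withHadoopJarStep(stepFactory.newRunHiveScriptStep(
        "s3://mybucket/hive/query.q",
        "-d", "INPUT=s3://mybucket/input"));
```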
public HadoopJarStepConfig newInstallPigStep()

Step that installs the default version of Pig on your job flow.

public HadoopJarStepConfig newInstallPigStep(String... pigVersions)

Step that installs Pig on your job flow.

Parameters:
pigVersions - the versions of Pig to install.

public HadoopJarStepConfig newRunPigScriptStep(String script, String pigVersion, String... scriptArgs)

Step that runs a Pig script on your job flow using the specified Pig version.

Parameters:
script - The script to run.
pigVersion - The Pig version to use.
scriptArgs - Arguments that get passed to the script.
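Similarly for Pig, a sketch using the versioned overload; the script path and the "0.9.1" version string are assumptions:

```java
StepFactory stepFactory = new StepFactory();

// Placeholder script path and Pig version -- substitute supported values.
StepConfig runPigScript = new StepConfig()
    .withName("Run Pig Script")
    .withActionOnFailure("CONTINUE")
    .withHadoopJarStep(stepFactory.newRunPigScriptStep(
        "s3://mybucket/pig/transform.pig", "0.9.1"));
```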
public HadoopJarStepConfig newRunPigScriptStep(String script, String... scriptArgs)

Step that runs a Pig script on your job flow using the default Pig version.

Parameters:
script - The script to run.
scriptArgs - Arguments that get passed to the script.

Copyright © 2013 Amazon Web Services, Inc. All Rights Reserved.