1. The Complete Guide To Sikuli X
  2. No longer maintained! Find the new Sikuli X doc at link below!
  3. 1. Basics
  4. 2. Navigator to all Classes, Methods, Functions and Settings
    1. 2.1. You may be interested in these sections
    2. 2.2. Sikuli Class Index
    3. 2.3. Sikuli Methods, Functions and Settings Index
    4. 2.4. Constants related to Mouse and Keyboard Actions
    5. 2.5. Other Constants
  5. 3. How to use this document
    1. 3.1. Abbreviations and other Conventions
    2. 3.2. Reading the Method and Function definitions
    3. 3.3. What's new in Version X
    4. 3.4. Other places to get Information
      1. About Sikuli
      2. About Python and Jython
  6. 4. Global Functions and Features
    1. 4.1. Importing other Sikuli Scripts (reuse code and images)
    2. 4.2. Controlling Sikuli Scripts and their Behavior
      1. MoveMouseDelay
      2. DelayAfterDrag / DelayBeforeDrop
      3. setShowActions( False | True ) / SlowMotionDelay
      4. WaitScanRate / ObserveScanRate
      5. exit ([value])
    3. 4.3. Controlling Applications and their windows
      1. Class App
      2. open( [application] )
      3. focus( [application] )
      4. close( [application] )
      5. focusedWindow()
      6. window( [number] )
      7. openApp( application )
      8. switchApp( application )
      9. closeApp( application )
      10. run( command )
    4. 4.4. Interacting with the User
      1. popup( text )
      2. input( [text] )
    5. 4.5. General Settings and Access to Environment Information
      1. Accessing Settings
      2. Image Search Path
      3. setBundlePath( path-to-a-folder )
      4. getBundlePath()
      5. getOS(), getOSVersion()
      6. getClipboard()
  7. 5. Class Region
    1. 5.1. Methods of Region
    2. 5.2. Creating a Region, Setting and Getting Attributes
      1. Region( x, y, w, h ), Region( region ), Region( Rectangle )
      2. selectRegion( [text] )
      3. setX( number ), setY( number ), setW( number ), setH( …
      4. getX(), getY(), getW(), getH()
      5. setROI( x, y, w, h | rectangle ), setRect( x, y, w, h | …
      6. getROI(), getRect()
      7. getCenter()
      8. getScreen()
      9. getLastMatch(), getLastMatches()
      10. setAutoWaitTimeout( seconds ) / getAutoWaitTimeout()
    3. 5.3. Extending a Region
      1. inside()
      2. nearby( [range] )
      3. above( [range] )
      4. below( [range] )
      5. left( [range] )
      6. right( [range] )
    4. 5.4. Finding inside a Region and Waiting for a Visual Event
      1. find( PS )
      2. findAll( PS )
      3. wait( [PS], [seconds] )
      4. waitVanish( PS, [seconds] )
      5. exists( PS, [seconds] )
    5. 5.5. Observing Visual Events in a Region
      1. onAppear( PS, handler )
      2. onVanish( PS, handler )
      3. onChange( handler )
      4. observe( [seconds], [ background= False | True ] )
      5. stopObserver()
    6. 5.6. Acting on a Region
      1. click( PSMRL, [modifiers] )
      2. doubleClick( PSMRL, [modifiers] )
      3. rightClick( PSMRL, [modifiers] )
      4. highlight( [seconds] )
      5. hover( PSMRL )
      6. dragDrop( PSMRL, PSMRL, [modifiers] )
      7. drag( PSMRL )
      8. dropAt( PSMRL, [delay] )
      9. type( [PSMRL], text, [modifiers] )
      10. paste( [PSMRL], text )
    7. 5.7. Extracting Text from a Region
      1. text()
    8. 5.8. Low Level Mouse and Keyboard Actions
      1. mouseDown( button )
      2. mouseUp( [button] )
      3. mouseMove( PSRML )
      4. wheel( PSRML, WHEEL_DOWN | WHEEL_UP, steps )
      5. getMouseLocation()
      6. keyDown( key | list of keys )
      7. keyUp( [ key | list of keys ] )
    9. 5.9. Exception FindFailed
      1. setThrowException( False | True ) / getThrowException()
    10. 5.10. Grouping Method Calls (with Region:)
  8. 6. Class Screen
    1. 6.1. Methods of Screen
    2. 6.2. Screen: Setting, Getting Attributes and Information
      1. Screen(), Screen( id )
      2. getNumberScreens()
      3. getBounds()
    3. 6.3. Screen as (Default) Region
    4. 6.4. Capturing
      1. capture( [region | rectangle | text] ), capture( x, y, w, h )
      2. selectRegion()
    5. 6.5. Multi Monitor Environments
  9. 7. Class Location
    1. 7.1. Methods of Location
    2. 7.2. Creating a Location, Setting and Getting Attributes
      1. Location( x, y )
      2. getX(), getY(), setX( number ), setY( number )
      3. offset( dx, dy )
      4. above( dy )
      5. below( dy )
      6. left( dx )
      7. right( dx )
  10. 8. Class Match
    1. 8.1. Methods of Match
    2. 8.2. Creating a Match, Getting Attributes
      1. getScore()
      2. getTarget()
    3. 8.3. Iterating over Matches after findAll()
  11. 9. Class Finder
    1. 9.1. Methods of Finder
    2. 9.2. Creating a Finder
      1. Finder( path-to-imagefile )
    3. 9.3. Using a Finder
      1. find( path-to-imagefile, [ similarity ] )
      2. hasNext()
      3. next()
  12. 10. Class Pattern
    1. 10.1. Methods of Pattern
    2. 10.2. Creating a Pattern, Setting and Getting Attributes
      1. Pattern( string )
      2. similar( similarity )
      3. exact()
      4. targetOffset( dx, dy)
      5. getFilename()
      6. getTargetOffset()
  13. 11. Class VDict
    1. 11.1. Methods/Operators/Constants of VDict
    2. 11.2. Setting it up and getting Information
      1. VDict( [ vdict | dict ] )
      2. len( vdict )
      3. keys()
    3. 11.3. Managing Items
      1. vdict[ path-to-an-imagefile ] = value
      2. del vdict[ path-to-an-imagefile ]
    4. 11.4. Searching for specific Items
      1. path-to-an-imagefile [not] in vdict
      2. vdict[ path-to-an-imagefile ]
      3. get_exact( path-to-an-imagefile )
      4. get1( path-to-an-imagefile', similarity)
      5. get( path-to-an-imagefile', similarity, number )
    5. 11.5. Constants of VDict
  14. 12. Key Constants
    1. 12.1. Key Modifiers
    2. 12.2. Special Keys
  15. 13. Class Env

The Complete Guide To Sikuli X

Sikuli Version: X 1.0rc1
Authors:  Raimund Hocke,  Tsung-Hsiang (Sean) Chang

No longer maintained! Find the new Sikuli X doc at link below!

 New documentation for Sikuli X


1. Basics

This document was set up and is maintained by RaiMan (Raimund Hocke) with great support from Tsung-Hsiang (Sean) Chang (one of the Sikuli developers). If you have any questions or ideas about this document, you are welcome to contact RaiMan directly using the mail address on his personal Sikuli Launchpad page. For questions regarding the functions and features of Sikuli itself, please use the Sikuli Questions and Answers Board. For hints and links on how to get more information and help, please see Other places to get Information in this document.


2. Navigator to all Classes, Methods, Functions and Settings

RECOMMENDATION: To make this document more accessible, we have placed this navigator first.

But before clicking around, especially the first time, please read through How to use this document. It helps you to understand the basics and how to find answers in this document.

2.1. You may be interested in these sections

2.2. Sikuli Class Index

2.3. Sikuli Methods, Functions and Settings Index

Note: A method/function/setting is noted with its (Class), if it exists with the same name in different classes, if it has to be qualified with the class name or simply to make it clear. Each section of a class also has a list of its contents.


2.4. Constants related to Mouse and Keyboard Actions

Their usage can be found in Acting on a Region and Low Level Mouse and Keyboard Actions. The constants' names should be self-explanatory.

  • Mouse related
    from Class Button: Button.LEFT - Button.MIDDLE - Button.RIGHT
    related to mouse wheel actions: WHEEL_DOWN - WHEEL_UP
  • Keyboard related
    The following predefined key constants are listed here for your convenience ( details ).

    • Key Modifiers
      KEY_ALT - KEY_CMD - KEY_CTRL - KEY_META - KEY_SHIFT - KEY_WIN
      A combination of these keys has to be set up using "+" or "|", e.g. KEY_SHIFT + KEY_ALT or KEY_SHIFT | KEY_ALT | KEY_CMD (used with action methods)

    • Special Keys
      to be used with type() or keyDown()/keyUp():
      ADD - ALT - BACKSPACE - CAPS_LOCK - DIVIDE - DOWN - END - ENTER - ESC - DELETE - F1 - F2 - F3 - F4 - F5 - F6 - F7 - F8 - F9 - F10 - F11 - F12 - F13 - F14 - F15 - HOME - INSERT - LEFT - MINUS - MULTIPLY - NUM_LOCK - NUM0 - NUM1 - NUM2 - NUM3 - NUM4 - NUM5 - NUM6 - NUM7 - NUM8 - NUM9 - PAGE_DOWN - PAGE_UP - PAUSE - PRINTSCREEN - RIGHT - SCROLL_LOCK - SEPARATOR - TAB - UP
      to be used with keyDown()/keyUp() only: CMD - CTRL - META - SHIFT - TAB - WIN
      To use these constants, write e.g. Key.ENTER. String concatenation with "+" is possible, e.g. "some text" + Key.ENTER

2.5. Other Constants

FOREVER: can be used with exists - observe - wait - waitVanish
SCREEN: a constant reference to a screen object created using Screen(0) at startup, the default/primary monitor used as the default region for methods not qualified by an instance/object/class.



3. How to use this document

Since Sikuli Script is built as a Jython (Python for the Java platform) library, you can use any syntax of the Python language. If you are new to programming, you can still enjoy using Sikuli to automate simple repetitive tasks without learning Python. But if you would like to write more powerful and complicated scripts, you may want to dive into the Python language (some additional Links). The preface of a chapter briefly describes a class or a group of methods. It contains general usage notes and hints that apply to all methods in that chapter. We recommend reading it before using those methods.

If you are new to Sikuli, it is a good idea to read through this document sequentially. Alternatively, you can scan the table of contents and jump to the chapters that interest you. In any case, we strongly recommend reading the introductory chapter Basics carefully. A middle way is to start with Class Region, continue with Class Match and finish with Class Screen.

For users of previous versions (0.9.x and 0.10.x), What's new in Sikuli X version 1.0 is a good start. After that, you can go to any place of interest using the table of contents, or use the navigator to browse all classes, methods and functions in alphabetical order (every class also lists its methods this way at the beginning of its chapter).


3.1. Abbreviations and other Conventions

IDE: when used, we are talking about the Sikuli IDE. (  more Information )

PSMRL: means that either a Pattern, a string as a path to an image file or just representing plain text, a Match, a Region or a Location can be passed as a parameter.

PS: means that either a Pattern or a string as a path to an image file or just representing plain text can be passed as a parameter.

bundle-path: The IDE stores the images created with the IDE's capture tool together with the script in a bundle (technically, a folder myScript.sikuli or a zipped file in case of executable myScript.skl -  more Information), at the time the script is saved or exported as executable. The full path to this bundle when the script is running is called "bundle-path".
When importing other Sikuli scripts, contained images will be found without any additional notation (see Reuse of code and images).
Additionally you can use SIKULI_IMAGE_PATH to make images accessible from other directories.

path to an image file: A string that contains the valid file name of an existing image file. If the file cannot be found at runtime, a FileNotFound exception will be raised.

  • If a file name is given as a relative path to the file, it is taken as relative to the bundle-path.
  • An absolute path will be taken as such.
  • A valid URL (e.g.  http://web-page/imagelib/picture.png) can be used as path to an image file

When working in the IDE, valid image filenames are shown as thumbnails of their images. If the picture cannot be found, the filename string is shown instead (no thumbnail at all). The same goes for situations where the filename can only be evaluated at runtime (e.g. as an expression: imagelib-path + "image.png").

Multi Monitor Environments: All descriptions in this document are based on the standard situation for only one Monitor (upper left corner of the screen is pixel position (0, 0)). However, Sikuli supports the configurations of multiple monitors. Please see the chapter Multi Monitor Environments for more details.

x, y, w, h: denote the attributes of a rectangle on the screen where x and y are the pixel coordinates of its upper left corner (position), and w and h are its width and its height (dimension).

Rectangle: When it is used as a parameter or a return value, Rectangle is an object of Java's class Rectangle, which has the four attributes x, y, width and height. The attributes can be read and written as rect.x, rect.y, rect.width and rect.height, where rect is an instance of Rectangle.

x, y: denote a pixel position (x,y) on the screen, where (0,0) is the top-left corner.

True, False: are the two boolean constants as defined in the Python language.

None: means nothing, as defined in the Python language.

Index Base: As in most programming languages and libraries, the first element of a list or an array has the index 0. This applies to all cases where more than one element is accessible by using an index. For example, the primary screen in a multi-screen environment is accessible with Screen(0).

Classes: As a common convention, class names start with an uppercase letter. Therefore, saying "Region" means the class Region, whereas a region may be an existing instance/object of class Region or something else in its context. The words instance and object are used synonymously.

CONSTANTS: As a common convention, variables written completely in capital letters are constant values. Since Python does not prevent you from redefining any of the existing constants and gives no warning when you do, we use this convention to at least draw your attention to the fact that something written in capital letters should not be redefined by you. For instance, don't say Key.ENTER = "something I think it should be". On the other hand, you are actually free to do what you want, as long as you know what you are doing ;-)

Objects and their references: To keep things simple, we normally do not differentiate between the object as a concrete structure and a reference to it. If you don't know what we are talking about, don't worry. In most situations, you don't need to know what a reference really is. However, when you start to play around with copying variables, you should be aware that Python doesn't actually "copy" things. Here is an example showing what happens when things get mixed up:

m = findAll(something) # returns a reference to an iterator object
mSaved = m # you may think you "save" the iterator object itself to mSaved ;-)
for gotIt in m: # going through the iterator
        print gotIt
mSaved.hasNext() # will return False, because mSaved and m are actually referencing the same object.

After the first loop, the iterator is empty (this is how it's defined). Therefore, since mSaved is just another reference to the same object, it is also empty. A possible solution for this dilemma (saving the content of an iterator of matches for later use) can be found here.
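The effect is not specific to Sikuli. The following is a minimal sketch in plain Python, assuming an ordinary iterator over placeholder strings in place of the iterator returned by findAll(); the names matches, alias and saved are made up for illustration. Copying the items into a list is one common way to keep them for later use, though it is not necessarily the solution linked above.

```python
# Plain Python stand-in for the iterator returned by findAll();
# the strings are hypothetical placeholders for real Match objects.
matches = iter(["match-1", "match-2", "match-3"])

alias = matches                  # a second name for the SAME iterator object
same_object = alias is matches   # True: nothing was copied

saved = list(matches)            # consumes the iterator and stores the items in a list
leftover = list(alias)           # the alias is exhausted as well, so this is empty
```

The list in saved can be looped over as often as needed, while both names for the iterator are now empty.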



3.2. Reading the Method and Function definitions

Each Method or Function is described using the following structure and notation:

  • method( parameter, [parameter], parameter1 | parameter2, *parameter )
  • parameter a description of parameter (for each parameter). Parameters in method or function calls are denoted in bold italic.
    • [parameter] means that this parameter can be omitted.
    • parameter1 | parameter2 means that either parameter1 or parameter2 can be used.
    • *parameter means that 0 or as many parameters as wanted/needed can be used.
    • If applicable, we talk about parameter default values in the parameter description
  • A description of what this method/function does. We use Windows, Mac and Linux to mark hints on platform-specific behavior.
  • returns: in this section we describe what the function/method may return. If this section is missing, nothing relevant is returned.
  • side effects: where applicable, side effects are described here.
# example

Sometimes we give examples with screenshots, which normally can be copied and pasted into Sikuli IDE.




3.3. What's new in Version X

Sikuli X is a new experimental branch of Sikuli. (X stands for eXperimental.)

We recommend that all current users of Sikuli 0.9 or 0.10 upgrade to X.

However, please keep in mind that some new features are still experimental, e.g. text recognition and the new API to get the bounds of any window. This means they may not work well or may not support all platforms yet.

  • New computer vision engine - faster and more reliable
  • Better capture mode on Mac (supports multi-screens, no flicker anymore)
  • Text recognition and matching (*)
  • Screenshot Naming in the IDE:
    • screenshots can be automatically named
      • with timestamps
      • with part of the text found in them
      • manually at time of capture
    • and renamed every time using the preview pane
  • Remote Images are supported
    e.g. click("http://sikuli.org/example/ok_button.png")
  • There is an Image Search Path - images can be stored wherever you like (see Image Search Path)
  • Scripts can be imported from .sikuli sources as a module (Python style) (see Importing Sikuli Scripts)
  • New App Class replaces the old openApp, switchApp, closeApp functions (see Class App)
    • App.open(), App.close(), App.focus()
    • App.window() returns the bounds of the app window as a Region, so you can restrict subsequent actions to that region. (**)
  • Beautified Run in Slow Motion mode (see Controlling Sikuli Scripts)
  • Smooth mouse movement (see Controlling Sikuli Scripts)
  • More Special Keys are supported (PrintScreen, NumPad, CapsLock...)
  • New Region Highlighting: region.highlight() (**)
  • Mouse Wheel supported: wheel(target, WHEEL_UP | WHEEL_DOWN, steps) for scrolling the mouse wheel

(*) experimental
(**) experimental (Windows and Mac only)



3.4. Other places to get Information

Links to places outside of the Sikuli project in the following are given without taking any responsibility for their contents.

About Sikuli

About Python and Jython



4. Global Functions and Features

Accessing Sikuli Settings - Controlling Applications - exit - run - input - popup

getBundlePath - setBundlePath - setShowActions

This chapter describes global functions and features.



4.1. Importing other Sikuli Scripts (reuse code and images)

As your scripts grow, or if you are used to structuring your solutions as a modular system, you may want to use the module support of the underlying programming environment (Python/Jython) in your Sikuli scripts as well.

This is possible with Sikuli X:

  • import other .sikuli in a way that is fully compatible with Python import
  • automatically access images contained in the imported .sikuli (no need to use setBundlePath())

Note: Currently a .skl cannot be imported. As a workaround, you can unzip the .skl on the fly (e.g. with gzip on the command line) to a place of your choice as .sikuli (e.g. a temp directory) and import it from there.

The prerequisites:

  • the directories/folders containing your .sikuli's you want to import have to be in sys.path (see below Usage)
  • your imported script must contain (recommendation: as first line) the following statement: from sikuli.Sikuli import *
    (this is necessary for the Python environment to know the Sikuli classes, methods, functions and global names)

Usage:

  • prepare sys.path (the example contains a recommendation to avoid double entries)
  • import your .sikuli using just its name
# an example - choose your own naming
import sys # harmless if Sikuli has already made sys available

# on Windows
myScriptPath = "c:\\someDirectory\\myLibrary"
# on Mac/Linux (keep only the assignment that matches your system)
myScriptPath = "/someDirectory/myLibrary"

# all systems: append only if the entry is not already present
if myScriptPath not in sys.path: sys.path.append(myScriptPath)

# supposing there is a myLib.sikuli
import myLib

# supposing myLib.sikuli contains a function "def myFunction():"
myLib.myFunction() # makes the call
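If several scripts prepare sys.path in this way, the duplicate check can be wrapped in a small helper. This is only a sketch in plain Python; the function name add_script_dir and the demonstration paths are made up for illustration and are not part of Sikuli.

```python
import sys

def add_script_dir(path, search_path=None):
    """Append path to the module search path unless it is already listed.

    Returns True if the entry was added, False if it was already present.
    """
    if search_path is None:
        search_path = sys.path
    if path in search_path:
        return False
    search_path.append(path)
    return True

# demonstration on a private list, so the real sys.path is left alone
demo_path = ["/someDirectory/other"]
first = add_script_dir("/someDirectory/myLibrary", demo_path)   # entry is added
second = add_script_dir("/someDirectory/myLibrary", demo_path)  # duplicate is skipped
```

In a script you would call add_script_dir(myScriptPath) without the second argument, so that the real sys.path is extended.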

Note on contained images: Together with the import, Sikuli internally uses the new SIKULI_IMAGE_PATH to make sure that images contained in imported .sikuli's are found automatically.

Some comments for readers not familiar with Python import

  • an import is only processed once (the first time it is found in the program flow). So be aware:
    • if your imported script contains code outside of any def()'s, then this code is only processed once at the first time, when the import is evaluated
    • since the IDE is not reset at rerun of scripts: when changing imported scripts while they are in use, you have to restart the IDE.
  • Python has a so-called namespace concept: names (variables, functions, classes) are only known in their own namespace
    • your main script has its own namespace
    • each imported script has its own namespace (that's why you need from sikuli.Sikuli import *)
    • so names contained in an imported script have to be qualified with the module name (e.g. myLib.myFunction())
    • you may use from myLib import *, which integrates all names from myLib into your current namespace. So you can use myFunction() directly. When you decide to use this version, be sure you have a naming convention that prevents naming conflicts.

Another example: Importing from the same directory

This approach lets you develop a modularized script app that is contained in one directory. This directory can be moved around without any changes and can even be distributed as a zipped file.

# works on all platforms
p = getBundlePath()
slash = "\\" if Env.getOS() == OS.WINDOWS else "/"
myPath = p.rpartition(slash)[0] # gets the directory containing your running .sikuli
if not myPath in sys.path: sys.path.append(myPath)

# now you can import every .sikuli in the same directory
import myLib
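The directory lookup can also be left to Python's standard path modules instead of splitting the string by hand. The sketch below is plain Python and makes some assumptions: the helper name containing_directory and the example paths are invented, and whether getBundlePath() returns a trailing separator is not stated here, so the helper strips one if present.

```python
import ntpath
import posixpath

def containing_directory(bundle_path, windows=False):
    """Return the directory that holds the given .sikuli bundle."""
    pathmod = ntpath if windows else posixpath
    # drop a trailing separator so that dirname() sees the bundle itself
    return pathmod.dirname(bundle_path.rstrip("/\\"))

# hypothetical bundle paths, for illustration only
on_unix = containing_directory("/home/me/scripts/main.sikuli")
on_windows = containing_directory("c:\\scripts\\main.sikuli\\", windows=True)
```

The two modules are named explicitly only to show both platforms side by side. In standard Python, os.path already refers to the variant that matches the running system.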



4.2. Controlling Sikuli Scripts and their Behavior

exit - setShowActions

Accessible attributes in class Settings:

DelayAfterDrag - DelayBeforeDrop - MoveMouseDelay - WaitScanRate - ObserveScanRate - SlowMotionDelay


MoveMouseDelay

By default, moving the mouse pointer from its current location to the target location of a mouse action takes 1.0 second. During this time, the mouse pointer moves continuously with decreasing speed towards the target point. An additional benefit of this behavior is that it gives the active application some time to react to the previous mouse action, since the action itself (e.g. a click) is simulated at the end of the mouse movement.

Settings.MoveMouseDelay - control the time taken for mouse movement to a target location by setting Settings.MoveMouseDelay to a decimal value meaning seconds (default 1.0). Setting it to 0 will switch off any animation (the mouse will "jump" to the target location).

mmd = Settings.MoveMouseDelay # save default/actual value
click(image1) # implicitly wait 1 second before click
Settings.MoveMouseDelay = 3
click(image2) # give app 3 seconds time before clicking again
Settings.MoveMouseDelay = mmd # reset to original value
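The save/change/restore pattern shown above can be packaged so that the original value is restored even if an action in between raises an exception. This is a sketch in plain Python, not a Sikuli feature: FakeSettings is a hypothetical stand-in for the real Settings class, and temporary_setting is an invented helper name. On older Jython versions the with statement may first need from __future__ import with_statement.

```python
from contextlib import contextmanager

class FakeSettings(object):
    """Hypothetical stand-in for Sikuli's Settings class (not the real one)."""
    MoveMouseDelay = 1.0

@contextmanager
def temporary_setting(holder, name, value):
    """Set holder.<name> to value and restore the previous value on exit."""
    previous = getattr(holder, name)
    setattr(holder, name, value)
    try:
        yield previous
    finally:
        # runs on normal exit and on exceptions alike
        setattr(holder, name, previous)

with temporary_setting(FakeSettings, "MoveMouseDelay", 3):
    inside = FakeSettings.MoveMouseDelay   # the slower movement applies here
after = FakeSettings.MoveMouseDelay        # the original value is back
```

In a real script, the click() calls would go inside the with block and Settings would take the place of FakeSettings.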



DelayAfterDrag / DelayBeforeDrop

When using dragDrop(), the operation is sometimes not processed as expected. This may be because the Sikuli actions are too fast for the target application to react properly. These settings control the waiting time after the mouse down at the source location and before the mouse up at the target location of a dragDrop operation. The default is 0.3 seconds for each value. The time taken to move the mouse from source to target is controlled by Settings.MoveMouseDelay.

Settings.DelayAfterDrag - specify the waiting time after mouse down at the source location as a decimal value meaning seconds (default 0.3).

Settings.DelayBeforeDrop - specify the waiting time before mouse up at the target location as a decimal value meaning seconds (default 0.3).

# you may wish to save the actual settings before
Settings.DelayAfterDrag = 1
Settings.DelayBeforeDrop = 1
Settings.MoveMouseDelay = 3
dragDrop(source_image, target_image)
# time for complete dragDrop: about 5 seconds + search times
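The "about 5 seconds" in the comment above is the sum of the three settings. The sketch below reproduces that arithmetic in plain Python, following the accounting used in the comment; the function name is invented, and search times as well as any mouse movement before the mouse down are deliberately left out.

```python
def drag_drop_duration(delay_after_drag=0.3, move_mouse_delay=1.0, delay_before_drop=0.3):
    """Estimated seconds from mouse down at the source to mouse up at the target."""
    return delay_after_drag + move_mouse_delay + delay_before_drop

slow = drag_drop_duration(1, 3, 1)   # the settings from the example above
default = drag_drop_duration()       # the documented default values
```

With the example settings this gives 1 + 3 + 1 = 5 seconds; with the documented defaults it gives roughly 1.6 seconds.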



setShowActions( False | True ) / SlowMotionDelay

If set to True, Sikuli shows a visual effect (a blinking double-lined red circle) on the spot where an action will take place before executing it (e.g. click, dragDrop, type, etc.). By default the effect is shown for about 2 seconds. The default setting is False.

Settings.SlowMotionDelay - you can control the duration of the effect by setting Settings.SlowMotionDelay to a decimal value meaning seconds (default 2.0).

Both settings are remembered independently from each other.

setShowActions(True)
Settings.SlowMotionDelay = 3
click(path_to_some_image)
# before clicking, the showActions effect
# will be visible for about 3 seconds
setShowActions(False)



WaitScanRate / ObserveScanRate

By default, Sikuli internally processes about 3 search operations per second when processing a wait(), waitVanish(), exists() or observe(). If this leads to excessive usage of system resources, or if you intentionally want to look for the visual object less often, you may set the respective values to what you need. Since the value is used as a rate per second, values between 0 and 1 lead to scans every 1/value seconds (e.g. specifying 0.5 will lead to a scan every 2 seconds).

Settings.WaitScanRate - set it to a decimal value > 0. A search will happen every 1/value seconds. (default 3) (exists() and waitVanish() are also affected).

Settings.ObserveScanRate - set it to a decimal value > 0. A search will happen every 1/value seconds. (default 3).

Both settings are remembered independently from each other.

def myHandler(e):
   print "it happened"
# you may wish to save the actual settings before
Settings.ObserveScanRate = 0.2
onAppear(some_image, myHandler)
observe(FOREVER, background = True)
# the observer will look every 5 seconds
# since your script does not wait here, you 
# might want to stop the observing later on ;-)
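The relation between a scan rate and the resulting pause between two searches is simply interval = 1 / rate. A small plain-Python sketch of that conversion follows; the function name scan_interval is made up for illustration and is not part of Sikuli.

```python
def scan_interval(rate):
    """Seconds between two searches for a scan rate given in scans per second."""
    if rate <= 0:
        raise ValueError("scan rate must be greater than 0")
    return 1.0 / rate

default_interval = scan_interval(3)      # about a third of a second
slow_interval = scan_interval(0.5)       # one scan every 2 seconds
observer_interval = scan_interval(0.2)   # one scan every 5 seconds, as in the example above
```

Working the other way round, a desired pause of n seconds corresponds to a rate of 1.0 / n.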



exit ([value])

Stops the script gracefully at this point. The value is returned to the calling environment.



4.3. Controlling Applications and their windows

Here we talk about opening or closing other applications, switching to them (bring their windows to front) or accessing an application's windows.

In this topic we have a special situation:

  • The three established generic methods openApp, switchApp and closeApp are still valid at the moment, but they should be regarded as deprecated.
  • There is a new class App that lets you handle a specific application as an object with attributes and methods. This is the basis for developing this topic further towards handling application windows: for now App.focusedWindow() and App.window() (not yet available on Linux).
  • We recommend switching to the class App and its features the next time you work on one of your existing scripts, and in all cases when developing new scripts.

This is a comparison of old and new functions:

App.open - openApp* - App.focus - switchApp* - App.close - closeApp*

App.focusedWindow** - App.window**

run

* method deprecated
** not yet available on Linux



Class App

Using class methods or instance methods

Generally you have the choice between using the class methods (e.g. App.open("application-identifier")) or first creating an App instance and using the instance methods afterwards (e.g. myApp = App("application-identifier") and then later on myApp.open()). In the current state of development of the class App, there is no recommendation for a preferred usage. The only real difference is that you might save some resources when using the instance approach, since using the class methods produces more intermediate objects.

How to create an App instance

The basic choice is to just say someApp = App("some-app-identifier") and you have your app instance, that you can later on use together with its methods, without having to specify the string again. Additionally App.open("some-app-identifier") and App.focus("some-app-identifier") return an app instance, that you might save in a variable to use it later on in your script.

Differences between Windows/Linux and Mac

Windows/Linux: Sikuli's strategy on these systems at the moment is to rely on implicit or explicit path specifications to find an application that has to be started. Running "applications" can be identified either by their PID (process ID) or by their window titles. So using a path specification will only switch to an open application if the application internally handles the "more than one instance" situation.
You will usually use App.open("c:\\Program Files\\Mozilla Firefox\\Firefox.exe") to start Firefox. This might open an additional window. You can use App.focus("Firefox") to switch to the frontmost Firefox window (which has no effect if no window is found). To clarify your situation you may use the new window() method, which lets you look for existing windows. The second possible approach is to store the App instance that is returned by App.open() in a variable and use it later on with the instance methods (see examples below).

If you specify the exact window title of an open window, you will get exactly this one. But if you specify some text that is found in more than one open window title, you will get all these windows in return. So this is good e.g. with Firefox, where every window title contains "Mozilla Firefox", but it might be inconvenient when looking for "Untitled", which may be in use by different apps for new documents. So if you want exactly one specific window, you either need to know the exact window title or at least some part of the title text that makes this window unique in the current context (e.g. save a document with a specific name before accessing its window).
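The consequence of matching by title text can be illustrated without Sikuli. The sketch below is plain Python and only models the idea: the window titles are invented, the helper name windows_matching is hypothetical, and a simple case-sensitive substring test is assumed, which is not necessarily how Sikuli compares titles internally.

```python
def windows_matching(titles, text):
    """Return every title that contains text (plain, case-sensitive substring test)."""
    return [title for title in titles if text in title]

# hypothetical window titles, for illustration only
open_titles = [
    "Sikuli - Mozilla Firefox",
    "Launchpad - Mozilla Firefox",
    "Untitled - Notepad",
    "Untitled 1 - OpenOffice Writer",
]

firefox = windows_matching(open_titles, "Mozilla Firefox")     # both Firefox windows
untitled = windows_matching(open_titles, "Untitled")           # windows of two different apps
unique = windows_matching(open_titles, "Untitled - Notepad")   # exactly one window
```

The more specific the title text, the fewer windows qualify, which is why a unique part of the title is needed to address one particular window.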

On Mac OS X, the information about which windows belong to which applications is available at the system level, and Sikuli uses it. So by default, e.g. App.focus("Safari") starts Safari if it is not already open and switches to Safari if it is open, without doing anything with its windows (the z-order is not touched). Additionally, you can get all windows of an application without knowing their titles.

Note on Windows: when specifying a path in a string, you have to use \\ (double backslash) for each \ (backslash)
e.g. myPath = "c:\\Program Files\\Sikuli-IDE\\Lib\\"
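The doubling is needed because the backslash is the escape character in ordinary Python string literals. As a plain-Python sketch (raw strings are standard Python syntax, but they are not mentioned elsewhere in this document), the following compares the escaped form with a raw string and counts the backslashes that are actually stored:

```python
escaped = "c:\\Program Files\\Sikuli-IDE\\Lib\\"    # doubled backslashes in the source text
raw = r"c:\Program Files\Sikuli-IDE\Lib" + "\\"      # raw string; it cannot end in a backslash
same = (escaped == raw)                              # both spell the same path
backslashes = escaped.count("\\")                    # single backslashes actually stored
```

Each \\ in the source text becomes one backslash in the resulting string, so the path above contains four of them.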



open( [application] )

Functionally equivalent to openApp( application ), where you can find examples of how to specify application.

application: The name of an application (case-insensitive), that can be found in the path used by the system to locate applications, or the full path to an application (Windows: use double backslash \\ in the path string to represent a backslash) (Should be omitted, when used as instance method).

Opens the application application and brings its windows to front. Wether this switches to the already open application or opens a new instance of the application depends on the system and the application (see Class App).

returns: an App instance if the application could be opened or switched to and None if no success.

Class method usage: App.open("app-identifier")
Instance method usage: someApp.open() - needs someApp = App("app-identifier") before


focus( [application] )

Functionally equivalent to switchApp( application ), where you can find examples of how to specify application.

application: The name of an application (case-insensitive) or (part of) a window title (Windows/Linux). It should be omitted when used as an instance method.

Switches to application application and brings its windows (Windows/Linux: the identified window) to the front. If the application is not running, Sikuli will try to launch it using openApp().

Mac: if the app has more than one window opened, Sikuli will bring all of them to the front without changing their z-order. Windows/Linux: application is searched in the title text of all open windows. So if you are sure that your app is running, application need not be an application's name. Moreover, if the windows of one app have different titles, you can select one of those windows by specifying its title or a part of it.

returns: an App instance if the application could be opened or switched to and None if no success.

Class method usage: App.focus("app-identifier")
Instance method usage: someApp.focus() - needs someApp = App("app-identifier") before


close( [application] )

Functionally equivalent to closeApp( application ), where you can find examples of how to specify application.

application: The name of an application (case-insensitive) or (part of) a window title (Windows/Linux). It should be omitted when used as an instance method.

Closes the given application application or the matching windows (Windows/Linux). It does nothing if no opened window (Windows/Linux) or running app (Mac) can be found.

Class method usage: App.close("app-identifier")
Instance method usage: someApp.close() - needs someApp = App("app-identifier") before

Note for Windows/Linux: Whether the application itself is closed depends on whether all open windows are closed, or whether a main window of the app is closed that in turn closes all other open windows.


focusedWindow()

Usable without an app-identifier: identifies the currently focused / frontmost window and switches to it.

returns: the region that the window currently occupies or None if there is no such window.

Note on Mac: When starting a script, Sikuli hides its window and starts processing the script. At this moment there is no focused window. You first have to click somewhere or use App.focus() to activate a focused window. So be aware that focusedWindow() may return None.

Note on Windows: focusedWindow() always returns a region, even if there is no open window at all. This might e.g. be the task bar or the region of an icon on the desktop.

# highlight the currently frontmost window for 2 seconds
App.focusedWindow().highlight(2)

# save the window's region first
firstWindow = App.focusedWindow()
firstWindow.highlight(2)


window( [number] )

Sikuli looks for the number'th window of the given application (Mac) or series of windows with matching title (Windows/Linux) and returns the region it is currently occupying. The numbering starts with 0 and follows the z-order (0 is topmost). If the window does not exist, None is returned.

number: 0 or a positive integer number. If omitted, 0 is taken as default.

returns: the region the window currently occupies if it exists, and None otherwise.

Class method usage: App("app-identifier").window()
Instance method usage: someApp.window() - needs someApp = App("app-identifier") before

Windows examples:

# using an existing window if possible
myApp = App("Firefox")
if not myApp.window(): # no window(0) - Firefox not open
   App.open("c:\\Program Files\\Mozilla Firefox\\Firefox.exe"); wait(2)
myApp.focus(); wait(1)
type("l", KEY_CTRL) # switch to address field

# using a new window
firefox =  App.open("c:\\Program Files\\Mozilla Firefox\\Firefox.exe"); wait(2)
firefox.focus(); wait(1)
# now your just opened new window should be the frontmost
with firefox.window(): # see the general notes below
   pass # replace with some actions inside the window(0)'s region
firefox.close() # close the window - stop the process

Mac example: looping through all available app windows

# not more than 100 windows should be open ;-)
myApp = App("Safari")
for n in range(100):
        w = myApp.window(n)
        if not w: break # no more windows
        w.highlight(2) # window highlighted for 2 seconds

General Notes:

  • Be aware that the window handling feature in particular is experimental and under further development.
  • Especially on Windows, be aware that there might be many matching windows, including windows that are not visible at all. Currently the window() function has no feature to identify a specific window besides returning the region. So you might need some additional checks to be sure you are acting on the right window.
  • Windows/Linux: The close() function currently kills the application without closing its windows first. This is an abnormal termination and might be recognized by your application at the next start (e.g. Firefox usually tries to reload the pages).
  • Even if the windows are hidden/minimized, the region they have in the visible state is returned. Currently there is no Sikuli feature to decide whether the given window(n) is visible or not, or whether it is currently the frontmost window. The only guarantee: window()/window(0) is the topmost window of an application (Mac) or of a series of matching windows (Windows/Linux).
  • Currently there are no methods available to act on such a window (resize, bring to front, get the window title, ...).

Some tips for using the window() feature:

  • check the position of a window's returned region: some apps hide their windows by giving them "outside" coordinates (e.g. negative)
  • check the size of a window's returned region: normally your app windows will occupy major parts of the screen, so a returned region of e.g. 150x30 might be some invisible stuff or an overlay on the real app window (e.g. the "search in history" input field on the Safari Top-Sites page, which is reported as window(0))
  • if you have more than one application window, try to position them at different coordinates, so you can decide which one you are acting on at the moment
  • use the new region.text() feature to extract the window title


openApp( application )

Functionally equivalent to App.open( application ).

application: The name of an application (case-insensitive) that can be found in the path used by the system to locate applications, or the full path to an application (Windows: use a double backslash \\ in the path string to represent a backslash).

Opens the application application and brings its windows to the front. Whether this switches to the already open application or opens a new instance of the application depends on the system and the application.

openApp("cmd.exe") # Windows: found through PATH
openApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # windows: full path specified
openApp("Safari") # Mac: opens Safari


switchApp( application )

Functionally equivalent to App.focus( application ).

application: The name of an application (case-insensitive) or (part of) a window title (Windows/Linux).

Switches to application application and brings its windows (Windows/Linux: the identified window) to the front. If the application is not running, Sikuli will try to launch it using openApp().

Mac: if the app has more than one window opened, Sikuli will bring all of them to the front without changing their z-order. Windows/Linux: application is searched in the title text of all open windows. So if you are sure that your app is running, application need not be an application's name. Moreover, if the windows of one app have different titles, you can select one of those windows by specifying its title or a part of it.

switchApp("cmd.exe") # Windows: switches to open command prompt or starts one
switchApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # windows: opens a new browser window !! (since text cannot be found in the window title)
switchApp("mozilla firefox") # windows: switches to the frontmost open browser window (no window open: does nothing !!)
switchApp("Safari") # Mac: switches to Safari or starts it


closeApp( application )

Functionally equivalent to App.close( application ).

application: The name of an application (case-insensitive) or (part of) a window title (Windows/Linux).

Closes the given application application or the matching windows (Windows/Linux). It does nothing if no opened window (Windows/Linux) or running app (Mac) can be found.

Note for Windows/Linux: Whether the application itself is closed depends on whether all open windows are closed, or whether a main window of the app is closed that in turn closes all other open windows.

closeApp("cmd.exe") # Windows: closes an open command prompt
closeApp("c:\\Program Files\\Mozilla Firefox\\firefox.exe") # windows: does nothing, since text cannot be found in the window title
closeApp("mozilla firefox") # windows: stops firefox including all its windows
closeApp("Safari") # Mac: closes Safari including all its windows


run( command )

command: a command, that can be run from the command line.

Executes the command command. The script waits for completion.


4.4. Interacting with the User

input - popup


popup( text )

text: a string that is used as a message

Displays a dialog box with an Ok button and text as message. The script waits for the user to click Ok.

popup("Hello World!\nHave fun with Sikuli!") # \n can break a line.


input( [text] )

text: a string that is used as a message. If omitted, it is left blank.

Displays a dialog box with an input field, a Cancel button, an OK button and text as message. The script waits for the user to click either Cancel or Ok.

returns:

  • None if the user clicks Cancel.
  • the text that the user entered into the input field, if Ok is clicked. An empty string is returned, if nothing was entered.

name = input("Please enter your name to log in:") # we save the returned text into a variable "name" for later use (e.g. paste())
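Because both None and an empty string count as false in Python, a simple "if not name:" cannot tell a cancelled dialog from an empty entry. The sketch below (plain Python, runnable without Sikuli) separates the three cases listed above. The helper describe_input_result() is hypothetical and is fed sample values instead of calling input().

```python
def describe_input_result(result):
    """Classify a value as returned by input() (hypothetical helper)."""
    if result is None:
        return "cancelled"   # the user clicked Cancel
    if result == "":
        return "empty"       # Ok was clicked, but nothing was entered
    return "text"            # Ok was clicked with some text entered

# Sample values standing in for real dialog results.
outcomes = [describe_input_result(value) for value in (None, "", "Alice")]
print(outcomes)
```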


4.5. General Settings and Access to Environment Information

getClipboard - getOS - getOSVersion - getBundlePath - setBundlePath

Accessing Settings

Sikuli Level

Sikuli internally uses its class Settings to store globally used settings. Publicly available attributes may be accessed by using Settings.attribute to get a value and Settings.attribute = value to set it. It is highly recommended to modify only attributes that are described in this document, unless you really know what you are doing.

Currently, all attributes of some value for scripting are described in the topic Controlling Sikuli Scripts and their Behavior.


Jython/Python Level

You may use all settings, that are defined in standard Python/Jython and that are available in your system environment. The modules sys and time are already imported, so you can use their methods without the need for an import statement.

sys.path may be one of the most valuable settings, since it is used by Python/Jython to locate modules that are referenced using import module. It is a list of paths that is e.g. maintained by Sikuli to implement Importing other Sikuli Scripts (reuse code) as a standard compliant feature.

If you want to use sys.path, it is recommended to do it as shown in the following example, to avoid appending the same entry again:

myPath = "some-absolute-path"
if myPath not in sys.path: sys.path.append(myPath)

Java Level

Java maintains a global storage for settings (key/value pairs) that can be accessed by the program/script. Sikuli uses it too for some of its settings. Normally it is not necessary to access these settings at the Java level from a Sikuli script, since Sikuli provides getter and setter methods for accessing the values that make sense for scripting. One example is the list of paths that Sikuli maintains to specify additional places to search for images (look at: Importing other Sikuli Scripts (reuse code and images)).

If needed, you may access the Java settings storage as shown in the following example:

import java
val = java.lang.System.getProperty("key-of-property") # get a value
java.lang.System.setProperty("key-of-property", value) # set a property's value


Image Search Path

On the Java level Sikuli maintains a list of locations to search for images, that are not found in the current .sikuli folder (bundle path).

It is automatically extended by Sikuli with script folders, that are imported (see: Reuse of Code and Images), so their contained images can be accessed.

You may inspect and change this list on your own using the following functions. But be careful if you are using the import feature in parallel - do not remove locations that are still needed.

getImagePath(): returns a list/array containing the current entries

addImagePath( "some-path" ): adds the path entry "some-path" at the end of the current list and returns None.

removeImagePath( "some-path" ): removes the path entry "some-path" from the current list and returns None.

"some-path" has to be a string containing a path specification for the platform you are on (Windows: use double backslashes \\).

If you want to be sure of the results of your manipulations, you have to use getImagePath() and check the returned list.

When searching images, the paths are scanned in the order of the list. The first image file with a matching image name is used.

Note: Behind the scenes this list is maintained in the java property store with the key SIKULI_IMAGE_PATH. This can be preset when starting the JVM using the environment variable SIKULI_IMAGE_PATH and can be accessed at runtime using the approach as mentioned under Accessing Settings - Java level. Be aware, that this is one string, where the different entries are separated with a colon ( : ).
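The two rules above (one colon-separated string, first matching entry wins) can be sketched in plain Python without Sikuli. The helpers split_image_path() and first_folder_with(), the folder names and their contents are all invented for illustration. The sketch does not deal with Windows drive letters, which themselves contain a colon.

```python
def split_image_path(value):
    """Split a colon-separated path string into its entries (hypothetical helper)."""
    return [entry for entry in value.split(":") if entry]

def first_folder_with(image_name, folders, folder_contents):
    """Return the first folder, in list order, that contains image_name, else None."""
    for folder in folders:
        if image_name in folder_contents.get(folder, ()):
            return folder
    return None

# Invented example values, not real Sikuli defaults.
value = "/scripts/common.sikuli:/scripts/extra-images"
folders = split_image_path(value)

# Stand-in for the file system: which image files each folder holds.
folder_contents = {
    "/scripts/common.sikuli": ["ok-button.png"],
    "/scripts/extra-images": ["ok-button.png", "logo.png"],
}

# Both folders hold ok-button.png, the earlier entry is the one used.
hit = first_folder_with("ok-button.png", folders, folder_contents)
print(folders)
print(hit)
```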


setBundlePath( path-to-a-folder )

path-to-a-folder: a fully qualified path to a folder containing your images used for finding patterns. Windows: use double backslashes.

Sets the path for searching images in all Sikuli Script methods. Sikuli IDE sets this automatically to the path of the folder where it saves the script (.sikuli). Therefore, you should use this function only if you really know what you are doing. Using it generally means that you would like to take care of your captured images by yourself.

Additionally images are searched for in the SIKULI_IMAGE_PATH, that is a global list of other places to look for images. It is implicitly extended by script folders, that are imported (see: Reuse of Code and Images).


getBundlePath()

returns: a string containing a fully qualified path to a folder containing your images used for finding patterns. Note: Sikuli IDE sets this automatically to the path of the folder where it saves the script (.sikuli). You may use this function, for example, to package your private files together with the script or to access the picture files in the .sikuli bundles for other purposes. Sikuli only gives you access to the path name, so you may need other Python modules for I/O or other purposes.

Other places, where Sikuli looks for images, might be in the SIKULI_IMAGE_PATH.


getOS(), getOSVersion()

  • Env.getOS() returns the type of system your script is running on (a static method of Class Env).

returns: one of the following constants: OS.MAC, OS.WINDOWS, OS.LINUX.

  • Env.getOSVersion() returns the version number of the system the script is running on. It does not tell you the type of system as such (use getOS() for that).

returns: a string

The example shows how to check where you are.

# on a Mac
myOS = Env.getOS() # you have a reference to the constant OS.MAC
myVer = Env.getOSVersion() # the specific version number

if myOS == OS.MAC:
   print "Mac " + myVer # prints e.g.: Mac 10.6.3
else:
   print "Sorry, not a Mac"

myOS = str(Env.getOS()) # you have a reference to a string containing "MAC"
if myOS == "MAC" or myOS.startswith("M"): # to show the possibilities
   print "Mac " + myVer # prints e.g.: Mac 10.6.3
else:
   print "Sorry, not a Mac"


getClipboard()

Usage: Env.getClipboard() (a static method of Class Env)

returns: the content of the Clipboard if it is text, otherwise an empty string.

Note: Be careful, when using getClipboard() together with paste(): since paste internally uses the clipboard to transfer text to other applications, the clipboard will contain what you just pasted. Therefore, if you need the content of the clipboard, you should call getClipboard() before using paste().

Tip: When the clipboard content was copied from a web page that mixes images and text, you should be aware that there may be unexpected whitespace characters around and inside your text. In this case, you can use Env.getClipboard().strip() to get rid of the surrounding whitespace.
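As a sketch of the tip above, the following plain Python lines (runnable without Sikuli) use an invented string in place of a real Env.getClipboard() result. They show that strip() removes only the surrounding whitespace. The split/join step for the inner whitespace is a general Python idiom added here as a suggestion, not something the clipboard functions do for you.

```python
# Invented clipboard content, as it might look after copying from a web page.
clipboard_text = "\n  Sikuli\tScript  \n"

# strip() removes leading and trailing whitespace only.
cleaned = clipboard_text.strip()

# Optional: also collapse inner runs of whitespace into single spaces.
collapsed = " ".join(clipboard_text.split())

print(repr(cleaned))
print(repr(collapsed))
```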


5. Class Region


A region is a rectangular section on a screen, which is defined by

  1. a location (x, y) of its upper left corner as a distance relative to the upper left corner of the screen (0, 0),
  2. and its dimension (w, h) as its width and height.

x, y, w, h are integer numbers counting a distance in pixels.

A region knows nothing about its visual content (windows, pictures, graphics, text, ...). It only knows the position on the screen and its dimension.

New regions can be created, based on an existing region: you can extend a region in all directions or get the adjacent rectangle up to the bounds of the screen horizontally or vertically.

The visual and textual content of a region can be evaluated by using methods like find(), which look for a given rectangular pixel pattern or text string within the region. The matching content in the region has a similarity between 0 (not found) and 1 (found, and matching the pattern exactly pixel by pixel). A find operation can be told to search with a minimum similarity, so that some minor variations in shape and color can be ignored. If nothing else is specified, Sikuli searches with a minimum similarity of 0.7, which does what is expected in general cases.
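The effect of the minimum similarity can be pictured with a small sketch in plain Python that runs without Sikuli. The helper accepted() and the scores are invented, and the sketch assumes that a score exactly equal to the minimum still counts as found, which the text above does not state.

```python
DEFAULT_MIN_SIMILARITY = 0.7  # the default named in the text above

def accepted(scores, min_similarity=DEFAULT_MIN_SIMILARITY):
    """Keep only scores that reach the minimum similarity (hypothetical helper)."""
    return [score for score in scores if score >= min_similarity]

# Invented similarity scores of four candidate spots in a region.
candidate_scores = [0.95, 0.72, 0.69, 0.40]

default_hits = accepted(candidate_scores)      # tolerant of minor variations
strict_hits = accepted(candidate_scores, 0.9)  # only near-exact candidates

print(default_hits)
print(strict_hits)
```

Raising the minimum makes the search stricter, lowering it makes the search more tolerant of differences in shape and color.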

Find operations return a match, which has all the attributes and methods of a region and can be used in exactly the same way (e.g. to find or click another target within it). A match has the dimension of the pattern used for searching and also knows the position where it was found and its similarity score. A region preserves the best match of the last successful find operation and all matches of the last successful findAll() (accessible with getLastMatch()/getLastMatches()). You can wait for patterns to show up using wait(), to vanish using waitVanish(), or just check whether a pattern exists without handling exceptions.

Sikuli X supports visual event driven programming. You can tell a region to observe that something appears, vanishes, or changes. It's possible to wait for the completion of an observation or let it run in the background while your script is continuing. When one of the visual events happens, a handler in your script is called. Each region has one observer and each observer can handle multiple visual events. It's your responsibility to stop an observation.

You can act on a region by simulating mouse and keyboard actions. You have the choice between acting on spots that are evaluated from previous finding operations (e.g. a match), or act on a pattern that will be found implicitly. To simulate more complicated actions with special applications like games or graphics, low level actions for mouse and keyboard are available.

To support even more sophisticated and robust scripts, you may want to handle the FindFailed exception, which is thrown when a find operation fails.

Operations that target the same region can be grouped together using Python's "with" statement.

Screen is another class that inherits all attributes and methods of class Region. Through class Screen, you can access the screen dimensions and the different monitors, if you have more than one. It also gives you a default region to act on without specifying one. Normally you would have to say region.find(image), where region is usually the whole screen. So as a convenience, saying find(image) simply acts on the whole screen (actually the default/primary screen), without specifying the default SCREEN every time. On the other hand, this may slow down searching, because finding a target on the whole screen is time-consuming. Therefore, to speed up processing, region.find() can restrict the search to a specified smaller rectangle, usually the main window of the application you are interested in. Another possibility is to use setROI() to restrict the search for all following find operations to a region smaller than the whole screen.


PSMRL: means that either a Pattern, a string as a path to an image file or representing plain text, a Match, a Region or a Location can be passed as a parameter.

PS: means that either a Pattern, or a string as a path to an image file or representing plain text can be passed as a parameter.

applicable for Screen or Match or Screen and Match: Though all methods of class Region can be used with objects of the classes Screen and Match, not every method really makes sense with these classes (e.g. Screen(0).below(), find(PS).setRect(rectangle)). So if mentioned, this means that using this method with a screen object and/or a match object makes sense in applicable situations.

Note on Multi Monitor Environments: In situations where more than one monitor is active and you want to act on regions that are not located on the default screen (the primary monitor in this case), special aspects have to be taken into account. Therefore, before starting to work with Sikuli on more than one monitor, please read the chapter Multi Monitor Environments first.


5.1. Methods of Region

above - below - click - drag - dragDrop - dropAt - doubleClick - exists - find - findAll - getCenter - getH - getLastMatch - getLastMatches - getRect - getROI - getScreen - getW - getX - getY - highlight - hover - inside - keyDown - keyUp - left - mouseDown - mouseMove - mouseUp - nearby - observe - onAppear - onVanish - onChange - paste - right - rightClick - selectRegion - setAutoWaitTimeout - setH - setRect - setROI - setThrowException - setW - setX - setY - stopObserver - text - type - wait - waitVanish

Note: If you have more than one monitor active, read Multi Monitor Environments first.


5.2. Creating a Region, Setting and Getting Attributes

getCenter - getH - getLastMatch - getLastMatches - getRect - getROI - getScreen - getW - getX - getY - selectRegion - setAutoWaitTimeout - setH - setRect - setROI - setW - setX - setY

In this chapter, you can find information on how to create a new region object. Some of the attributes of a region object can be accessed directly or via a method call. Here you will find the HowTo's.


Region( x, y, w, h ), Region( region ), Region( Rectangle )

x, y, w, h: the attributes of a rectangle.

region: an existing region object.

rectangle: an existing object of java class Rectangle.

In addition to creating a region by using the tool provided by the IDE, a region can be created by specifying a rectangle. This is how the visual representation in the IDE of such a region is internally set up in the script. A region can also be created by users at run time using selectRegion().

You can create a region by giving another region. This just duplicates the region into a different and new object. This can be useful if you need the same region with different attributes, such as a different observation loop or a different setting for whether an exception is thrown when a find fails. Another way to create a region is to specify a rectangle object or to extend an existing region.

returns: a new region object

myReg = Region(0, 0, 500, 500) # a new region in upper left part of screen, with width=500 and height=500
myReg1 = Region( myReg ) # new region with the same rectangle as myReg

# using a rectangle
rect = myReg1.getRect() # returns a rectangle object
rect.x += 100 # adds 100 to its x coordinate
myReg2 = Region( rect )


selectRegion( [text] )

In fact, selectRegion() is a method of Class Screen, but since it creates a region, it's mentioned here too.

text: is displayed for about 2 seconds in the middle of the screen. If text is omitted, the default "Select a region on the screen" is displayed.

The interactive capture mode is entered and allows the user to select a region the same way as using the selection tool in the IDE. You may have to check the result, since the user may cancel the capturing.

returns: a new region object or None, if the user cancels the capturing process.

applicable for Screen


setX( number ), setY( number ), setW( number ), setH( number )

number: an integer value

The respective attribute of the region is set to the new value. This effectively moves the region around and/or changes its dimension.


getX(), getY(), getW(), getH()

returns: the respective attribute of the region as an integer value.

applicable for Screen and Match


setROI( x, y, w, h | rectangle ), setRect( x, y, w, h | rectangle )

x, y, w, h: the attributes of a rectangle

rectangle: a rectangle object

Both methods do exactly the same thing: they set position and dimension to new values. The motivation for two names is to make scripts more readable: setROI() is intended to shrink a screen object to speed up searches (region of interest), whereas setRect() should be used to redefine a region (which could mean enlarging it).

returns: None

applicable for Screen


getROI(), getRect()

Both do the same thing. For the motivation to have two names, see setROI() and setRect().

returns: the rectangle attributes of the region as a rectangle object

applicable for Screen and Match

myRect = myReg.getRect() # myRect is an object of java class Rectangle
(x, y, w, h) = (myRect.x, myRect.y, myRect.width, myRect.height) # the 4 attributes are assigned to different variables
print getROI() # shows "java.awt.Rectangle[x=0,y=0,width=1280,height=800]" (current region of interest of primary screen)


getCenter()

Returns the pixel position of the center of that region: x + w/2, y + h/2. Fractions are rounded down to integer.

returns: an object of class Location.

applicable for Screen and Match

loc = Region(0, 0, 101, 201).getCenter() # loc will contain position (50, 100)
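The arithmetic behind this example can be checked with a small sketch in plain Python that runs without Sikuli. The helper center_of() is hypothetical and only mirrors the formula x + w/2, y + h/2 with fractions rounded down. It uses the // operator so that the integer rounding is explicit.

```python
def center_of(x, y, w, h):
    """Center per the formula above, fractions rounded down (hypothetical helper)."""
    return (x + w // 2, y + h // 2)

# Same numbers as in the Region(0, 0, 101, 201) example above.
center = center_of(0, 0, 101, 201)

# A region that does not start at the screen origin.
offset_center = center_of(10, 20, 5, 5)

print(center)
print(offset_center)
```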


getScreen()

Returns the screen object that contains this region. It only makes sense in Multi Monitor Environments, since it always returns the default screen in a single monitor environment.

returns: a new screen object

applicable for Match


getLastMatch(), getLastMatches()

All successful find operations (explicit like find() or implicit like click()) store the best match into lastMatch of the region that was searched. findAll() stores all found matches into lastMatches of the region that was searched as an iterator.

To access these attributes use region.getLastMatch() or region.getLastMatches() respectively.

How to use the iterator object returned by getLastMatches() is documented here.

returns: the best match as a match object, or one or more match objects as an iterator object, respectively.

applicable for Screen and Match


setAutoWaitTimeout( seconds ) / getAutoWaitTimeout()

  • setAutoWaitTimeout( seconds )

seconds: a number, which can have a fraction. The internal granularity is milliseconds.

The default is 3.0 seconds.

Sets the maximum waiting time in seconds for all following find operations in the respective region (by default SCREEN is used, if region is not specified). It enables all find operations to wait for the given pattern to appear until the specified amount of time has elapsed. As such it lets find() work like wait(), without being able to set an individual timeout value for a specific find operation.

  • getAutoWaitTimeout()

returns: the current value of the maximum waiting time for all following find operations in the respective region (by default SCREEN is used, if region is not specified).
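The idea of a find that keeps trying until a timeout has elapsed can be sketched in plain Python without Sikuli. This is not how Sikuli implements it: find_with_timeout(), its interval parameter and the check functions are invented for illustration, and the sketch returns None on timeout instead of raising FindFailed.

```python
import time

def find_with_timeout(check, timeout=3.0, interval=0.1):
    """Call check() until it returns a non-None value or timeout seconds elapse.

    Hypothetical helper: check stands in for one search attempt.
    """
    deadline = time.time() + timeout
    while True:
        result = check()
        if result is not None:
            return result           # found within the waiting time
        if time.time() >= deadline:
            return None             # waiting time is used up
        time.sleep(interval)        # pause before the next attempt

# A stand-in target that "appears" on the third attempt.
attempts = {"count": 0}

def appears_on_third_try():
    attempts["count"] += 1
    return "match" if attempts["count"] >= 3 else None

found = find_with_timeout(appears_on_third_try, timeout=1.0, interval=0.01)

# A target that never appears is given up after the timeout.
missing = find_with_timeout(lambda: None, timeout=0.05, interval=0.01)

print(found, attempts["count"], missing)
```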


5.3. Extending a Region

above - below - inside - left - nearby - right

These methods (except inside()), sometimes called spatial operators, return a new region object that is constructed based on the specified region. The range parameter, if given as a positive integer number, restricts the dimension of the new region (width and/or height respectively) to that value. If range is not specified, the new region extends to the respective boundary of the screen the given region belongs to. An exception is nearby(), which uses 50 as its default range.

Note: In all cases the new region does not extend beyond any boundary of the screen that contains the given region.
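The geometry of these operators can be modelled with plain tuples (x, y, w, h) in ordinary Python, without Sikuli. This is only a sketch of the rules stated above: the function names mirror the Sikuli methods but are free-standing, hypothetical helpers, the screen size is an assumption, and pixel-exact details of the real implementation (e.g. how border rows are counted) are not modelled.

```python
SCREEN = (0, 0, 1280, 800)  # assumed screen rectangle (x, y, w, h)

def clip(x, y, w, h, screen=SCREEN):
    """Cut the rectangle down to the part that lies inside the screen."""
    sx, sy, sw, sh = screen
    left, top = max(x, sx), max(y, sy)
    right, bottom = min(x + w, sx + sw), min(y + h, sy + sh)
    return (left, top, max(right - left, 0), max(bottom - top, 0))

def nearby(region, rng=50):
    """Grow the region by rng pixels in all directions (default 50)."""
    x, y, w, h = region
    return clip(x - rng, y - rng, w + 2 * rng, h + 2 * rng)

def above(region, rng=None):
    """Same x and width, rng pixels upwards (to the screen top if omitted)."""
    x, y, w, h = region
    if rng is None:
        rng = y - SCREEN[1]
    return clip(x, y - rng, w, rng)

def below(region, rng=None):
    """Same x and width, rng pixels downwards (to the screen bottom if omitted)."""
    x, y, w, h = region
    if rng is None:
        rng = SCREEN[1] + SCREEN[3] - (y + h)
    return clip(x, y + h, w, rng)

def left(region, rng=None):
    """Same y and height, rng pixels to the left (to the screen edge if omitted)."""
    x, y, w, h = region
    if rng is None:
        rng = x - SCREEN[0]
    return clip(x - rng, y, rng, h)

def right(region, rng=None):
    """Same y and height, rng pixels to the right (to the screen edge if omitted)."""
    x, y, w, h = region
    if rng is None:
        rng = SCREEN[0] + SCREEN[2] - (x + w)
    return clip(x + w, y, rng, h)

match = (600, 300, 100, 40)  # invented region, e.g. a found button
print(nearby(match))
print(above(match), above(match, 500))
print(below(match, 100), right(match))
```

Note that above(match, 500) yields the same rectangle as above(match): the requested range would reach beyond the top of the screen, so the result is cut off at the boundary.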


inside()

inside() can be used to make scripts more readable. region.inside().find() is totally equivalent to region.find().

returns: the region itself

applicable for Match


nearby( [range] )

range: positive integer number

The new region is extended in all directions by range pixels with the same center as the given region. If not given, range is set to 50 pixels by default.

returns: a new region (have a look)

applicable for Match


above( [range] )

The new region is constructed above the given one starting from its top border extending range pixels towards the top boundary of the screen containing the given region. If range is omitted, it extends to the boundary of the screen. It has the same width as the given region and the same rectangle x-value.

returns: a new region (have a look)

applicable for Match


below( [range] )

The new region is constructed below the given one starting from its bottom border extending range pixels towards the bottom boundary of the screen containing the given region. If range is omitted, it extends to the boundary of the screen. It has the same width as the given region and the same rectangle x-value.

returns: a new region (have a look)

applicable for Match


left( [range] )

The new region is constructed left of the given one starting from its left border extending range pixels towards the left boundary of the screen containing the given region. If range is omitted, it extends to the boundary of the screen. It has the same height as the given region and the same rectangle y-value.

returns: a new region (have a look)

applicable for Match


right( [range] )

The new region is constructed right of the given one starting from its right border extending range pixels towards the right boundary of the screen containing the given region. If range is omitted, it extends to the boundary of the screen. It has the same height as the given region and the same rectangle y-value.

returns: a new region (have a look)

applicable for Match


5.4. Finding inside a Region and Waiting for a Visual Event

exists - find - findAll - wait - waitVanish

In addition to acting on visual objects, finding them is one of the core functions of Sikuli. By default, if the visual object cannot be found, Sikuli stops the script by raising a FindFailed exception. This follows the standards of the Python language, so you can handle such exceptions using try ... except. If you are not used to programming and just want to bypass the exception handling, read the section about the exception FindFailed.

PS: means that either a Pattern or a string (path to an image file or just plain text) can be used as parameter. A find operation is successful if the given image is found with at least the given minimum similarity, or if the given text is found exactly. Similarity is a value between 0 and 1 that specifies how closely a candidate on the screen must resemble the given image. By default, the minimum similarity is 0.7 if an image, rather than a pattern object with a specific similarity, is given to find().

If a find operation is successful, the returned match is additionally stored internally with the region that was used for the search. So instead of using a variable to store the match (m = find()), you can use getLastMatch() to access it afterwards. Unsuccessful find operations leave these values unchanged. (A script only continues after an unsuccessful find when it uses exists(), handles the exception, or runs with setThrowException(False).)

Normally all these region methods are used as reg.find(PS), where reg is a region object. If written as find(PS) it acts on the default screen, which is an implicit region in this case. But sometimes it's a good idea to use region.find() to restrict the search to a smaller region in order to speed up processing.

If you have multiple monitors, please read Multi Monitor Environments.

Note on IDE: Capturing is a tool in the IDE to quickly set up images to search for. These images are named automatically by the IDE and stored together with the script when it is saved (we call this location in the file system the bundle path). Behind the scenes, the images themselves are specified simply by a string containing the file name (path to an image file).


find( PS )

PS: a pattern object or a string (path to an image file or just plain text)

Looks for a particular GUI element, which is seen as the given image or text. The given file name of an image specifies the element's appearance. It searches within the region and returns the best match, which shows a similarity equal to or greater than the minimum similarity given by the pattern. If no similarity was set for the pattern with Pattern.similar() beforehand, a default minimum similarity of 0.7 is used. If no match with at least the minimum similarity is found, the find fails (raises the exception FindFailed or returns None).

If autoWaitTimeout (setAutoWaitTimeout()) is set to a non-zero value, find() just acts as a wait().

returns: a match object that contains the best match. In case that exception handling for FindFailed is switched off by setThrowException(False), None is returned if nothing is found. (Note: By default, the exception handling of FindFailed is turned on).

Sideeffects:

  • FindFailed: If the find fails (no match can be found whose similarity is equal to or greater than the minimum similarity of the pattern) and exception handling is turned on (which is the default), an exception FindFailed is raised. If the script does not handle the exception, the script is stopped with a message about the exception.
  • lastMatch: the best match can be accessed using getLastMatch() afterwards

applicable for Screen and Match


findAll( PS )

PS: a pattern object or a string (path to an image file or just plain text)

Repeatedly looks for the pattern until no further match can be found that meets the requirements for a single find() with the specified pattern (look for details).

returns: one or more match objects as an iterator object. How to iterate through it is documented here. If exception handling for FindFailed is switched off by setThrowException(False), None is returned if nothing is found. (Note: By default, exception handling of FindFailed is turned on at script start.)

Sideeffects:

  • FindFailed: If the find fails (no match can be found, whose similarity is equal or greater than the minimum similarity of the pattern) and exception handling is turned on (which is the default) an exception FindFailed is raised. If the script does not handle the exception, the script is stopped, with a message about the exception.
  • lastMatches: a reference to the returned iterator object containing the found matches is stored with the region that was searched. It can be accessed using getLastMatches() afterwards. How to iterate through an iterator of matches is documented here.

applicable for Screen and Match
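
As a hedged sketch of working with the returned iterator, the helper below collects all matches into a list and orders them by their position on the screen. It assumes that the iterator object can be consumed by an ordinary Python loop and that match objects offer getX() and getY() like other regions; the helper name matches_top_to_bottom is hypothetical.

```python
def matches_top_to_bottom(region, target):
    """Collect all matches of target in region, ordered by position (hypothetical helper)."""
    found = region.findAll(target)
    if found is None:              # only possible after setThrowException(False)
        return []
    matches = [m for m in found]   # consume the iterator once
    # Order by row first, then by column within the same row.
    matches.sort(key=lambda m: (m.getY(), m.getX()))
    return matches
```

Copying the matches into a list is a deliberate choice here: an iterator can usually be walked through only once, whereas a list can be sorted, counted and reused.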


wait( [PS], [seconds] )

PS: a pattern object or a string (path to an image file or just plain text)

seconds: a number, which can have a fraction, as maximum waiting time in seconds. The internal granularity is milliseconds. If not specified, the auto wait timeout value is used. Use the constant FOREVER to wait for an infinite time.

If PS is not specified, the script just pauses for the specified amount of time. (It is still possible to use sleep(seconds) instead, but this is deprecated.)

PS specified: Keeps searching for the given pattern in the region until the image appears (would have been found with find(PS)) or the specified amount of time has elapsed. At least one find operation is performed, even if 0 seconds is specified.

returns: a match object that contains the best match. If exception handling for FindFailed is switched off by setThrowException(False), None is returned, if nothing is found within the specified waiting time. (Note: By default, the exception handling of FindFailed is turned on.)

Sideeffects:

  • FindFailed: If the find fails (no match can be found, whose similarity is equal or greater than the minimum similarity of the pattern within the specified waiting time) and exception handling is turned on (which is the default), an exception FindFailed is raised. If the script does not handle the exception, the script is stopped with a message about the exception.
  • lastMatch: the best match can be accessed using getLastMatch() afterward.

Note: You may adjust the scan rate (how often a search during the wait takes place) by setting Settings.WaitScanRate appropriately.

applicable for Screen and Match


waitVanish( PS, [seconds] )

PS: a pattern object or a string (path to an image file or just plain text)

seconds: a number, which can have a fraction, as maximum waiting time in seconds. The internal granularity is milliseconds. If not specified, the current auto wait timeout value is used. Use the constant FOREVER to wait for an infinite time.

Keeps searching for the given pattern in the region until the image vanishes (cannot be found with find(PS) any longer) or the specified amount of time has elapsed. At least one find operation is performed, even if 0 seconds is specified.

returns:

  • True if the pattern vanishes within the specified waiting time.
  • False if the pattern stays visible until the waiting time has elapsed.

Note: You may adjust the scan rate (how often a search during the wait takes place) by setting Settings.WaitScanRate appropriately.

applicable for Screen and Match


exists( PS, [seconds] )

PS: a pattern object or a string (path to an image file or just plain text)

seconds: a number, which can have a fraction, as maximum waiting time in seconds. The internal granularity is milliseconds. If not specified, the current auto wait timeout value is used. Use the constant FOREVER to wait for an infinite time.

Does exactly the same as wait(), but no exception is raised in case of FindFailed. This simplifies scripting when you only want to know whether something is there or not in order to decide how to proceed in your workflow, so it is typically used with an if statement. At least one find operation is performed, even if 0 seconds is specified. Specifying 0 seconds therefore saves time when there is no need to wait because you want the information "not found" immediately.

returns: a match object that contains the best match. None is returned, if nothing is found within the specified waiting time.

Sideeffects:

  • lastMatch: the best match can be accessed using getLastMatch() afterwards

Note: You may adjust the scan rate (how often a search during the wait takes place) by setting Settings.WaitScanRate appropriately.

applicable for Screen and Match
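
The typical use with an if statement might look like the following sketch. The helper name click_if_present is hypothetical; it relies only on exists() and click() as described in this document.

```python
def click_if_present(region, target, seconds=0):
    """Click target only if it is visible; return True if a click was made (hypothetical helper)."""
    match = region.exists(target, seconds)   # None instead of a FindFailed exception
    if match is None:
        return False
    # Passing the match avoids a second implicit find for the same image.
    return region.click(match) == 1
```

With the default of 0 seconds, the helper checks the screen once and returns immediately, which suits optional dialogs that may or may not be open.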


5.5. Observing Visual Events in a Region

observe - onAppear - onVanish - onChange - stopObserver

This feature makes it possible, to some extent, to implement visual event-driven programming.

You can tell a region to observe that something appears or vanishes, or that its content changes in any way. Using the methods onAppear(), onVanish() and onChange(), you register an event observer that starts its observation when you call observe(). Each region object can have exactly one observation active and running. For each observation, you can register as many event observers as needed. So you can think of it as grouping several wait() and waitVanish() calls together and having them processed simultaneously while you are waiting for one of these events to happen.

It's possible to let the script wait for the completion of an observation or let it run in the background, while your script is continuing. With a timing parameter you can tell observe() to stop observation anyway after the given time.

When one of the visual events happens, an event handler written by you is called. An event handler is a function contained in your script that expects an event object as a parameter. While your handler is being processed, the observation is paused until your handler has ended. Information can be passed back and forth between the main script and your handlers using global variables.

It's your responsibility to stop an observation. This can either be done by calling region.stopObserver() or by starting observe() with a timing parameter.

You can have as many region objects as needed, and each region can have one observation active and running. So theoretically it is possible to have as many visual events being observed at the same time as needed. In reality, the number of observations is still limited by the system resources available to Sikuli at that time.

Be aware that every observation is a number of different find operations that are processed repeatedly. So to speed up processing and keep your script responsive, you may want to define the region for observation as small as possible. You may adjust the scan rate (how often a search during the observation takes place) by setting Settings.ObserveScanRate appropriately.


PS: means, that either a Pattern or a String (path to an image file or just plain text) can be used as parameter.

handler: as a parameter in the following methods, you have to specify the name of a function that will be called by the observer when the observed event happens. The function itself has to be defined in your script before the method that references it is used. The existence of the function is checked before the script is started. So to get your script running, you need at least the following statements in your script:

def myHandler(event): # you can choose any valid function name
        # event: can be any variable name, it references an event object
        pass # add your statements here

onAppear("path-to-an-image-file", myHandler) # or any other onEvent()
observe(10) # observes for 10 seconds

Normally all the region methods are used as reg.onAppear(PS), where reg is a region object. If written as onAppear(PS), the repeatedly performed implicit find operations work on the default screen, which is the implicit region in this case. Using region.onEvent() instead restricts the search to the region's rectangle and speeds up processing if the region is significantly smaller than the whole screen.

Note: If more than one monitor is active, read Multi Monitor Environments first.

Note on IDE: Capturing is a tool in the IDE to quickly set up images to search for. These images are named automatically by the IDE and stored together with the script when it is saved (we call this location in the file system the bundle path). Behind the scenes, the images themselves are specified by a string containing the file name (path to an image file).


onAppear( PS, handler )

PS: a pattern object or a string (path to an image file or just plain text)

handler: the name of a function contained in your script.

With the given region you register an observer that waits for the pattern to be there or to appear and that is activated with the next call of observe(). At the moment the internal find operation on the given pattern is successful during observation, your handler is called and the observation is paused until you return from your handler.

applicable for Screen and Match


onVanish( PS, handler )

PS: a pattern object or a string (path to an image file or just plain text)

handler: the name of a function contained in your script

With the given region you register an observer that waits for the pattern to be absent or to vanish and that is activated with the next call of observe(). At the moment the internal find operation on the given pattern fails during observation, your handler is called and the observation is paused until you return from your handler.

applicable for Screen and Match


onChange( handler )

handler: the name of a function contained in your script

With the given region you register an observer that waits for the visual content of the given region to change and that is activated with the next call of observe(). At the moment the visual content changes during observation, your handler is called and the observation is paused until you return from your handler.

applicable for Screen and Match


observe( [seconds], [ background= False | True ] )

seconds: a number, which can have a fraction, as maximum observation time in seconds. Omit it or use the constant FOREVER to tell the observation to run for an infinite time (or until stopped by a call of stopObserver()).

background= False | True : when set to True, the observation will be run in the background and processing of your script is continued immediately. Otherwise the script is paused until the completion of the observation.

When called, the observation is started. For each region object, only one observation can be running at a given time.

Note: You may adjust the scan rate (how often a search during the observation takes place) by setting Settings.ObserveScanRate appropriately.

applicable for Screen and Match


stopObserver()

When called, a running observation of the relevant region is stopped.

  • as reg.stopObserver(), reg must be a valid reference to a region object.
  • If you are not sure whether you have access to this reference, or do not know which one to use (e.g. when using the same handler for different observations), you can use event.region.stopObserver(), where event is the one parameter that is defined with the handler (it can have any other valid variable name). Obviously, this is only available inside a handler.

applicable for Screen and Match
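
Putting the pieces of this section together, a handler that stops its own observation could be sketched as follows. The names events, on_target_appeared and appears_within are hypothetical; the sketch assumes a foreground observe() that returns once the observation has been stopped or the time has elapsed, and it uses a module-level list as the global variable shared between handler and main script.

```python
events = []   # filled by the handler, read by the main script

def on_target_appeared(event):
    """Event handler: record the event, then end the observation."""
    events.append(event)
    event.region.stopObserver()   # works without knowing which region fired

def appears_within(region, target, seconds):
    """Return True if target appeared in region within seconds (hypothetical helper)."""
    del events[:]                                  # forget events from earlier calls
    region.onAppear(target, on_target_appeared)    # register the observer
    region.observe(seconds)                        # script pauses here
    return len(events) > 0
```

Each call of appears_within() registers a further observer with the region, so the sketch is best suited to being called once per region object.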


5.6. Acting on a Region

click - drag - dragDrop - dropAt - doubleClick - highlight - hover - paste - rightClick - type

Besides finding visual objects on the screen, acting on these elements is one of the core operations of Sikuli. Mouse actions can be simulated as well as pressing keys on a keyboard.

The place on the screen that should be acted on (in the end just one specific pixel, the click point) can be given in several ways: as a pattern or string PS, as with the find operations; as a region object (including a match or a screen), whose center is used; as the target pixel connected with a match; or as a pixel location. Since all these choices can be used with all action methods as needed, they are abbreviated and called like this:

PSMRL: which means, that either a Pattern or a string (path to an image file or just plain text) or a Match or a Region or a Location can be used as parameter, in detail:

  • P: pattern: a pattern object. An implicit find operation is processed first. If successful, the center of the resulting match's rectangle is the click point. If the pattern object has a target offset specified, this is used as click point instead.
  • S: string: a path to an image file or just plain text. An implicit find operation with the default minimum similarity 0.7 is processed first. If successful, the center of the resulting match object is the click point.
  • M: match: a match object from a previous find operation. If the match has a target specified, it is used as the click point, otherwise the center of the match's rectangle.
  • R: region: a region object whose center is used as click point.
  • L: location: a location object which by definition represents a point on the screen that is used as click point.

modifiers: It is possible to press so-called key modifiers together with the mouse operation or when simulating keyboard typing. The respective parameter is given by one or more predefined constants. If more than one modifier is necessary, they are combined by using "+" or "|".
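
This document does not say how the modifier constants are encoded. Assuming they are distinct bit flags (an assumption, as are the values and names below), the following plain Python model shows why "+" and "|" give the same result as long as every modifier is used only once, and why "|" is the safer choice when a modifier might be given twice.

```python
# Illustrative bit-flag values only; in a script, use Sikuli's predefined
# modifier constants instead of defining your own.
SHIFT = 1
CTRL = 2
ALT = 8

def combine(*modifiers):
    """Combine modifier flags into one value using bitwise OR."""
    combined = 0
    for flag in modifiers:
        combined |= flag
    return combined

def is_pressed(combined, flag):
    """Return True if flag is contained in the combined value."""
    return (combined & flag) == flag
```

In this model SHIFT + CTRL and SHIFT | CTRL both describe "Shift and Ctrl", while CTRL + CTRL would accidentally turn into a different flag and CTRL | CTRL stays CTRL.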

Be aware: when PS is used as parameter, an implicit find operation has to be processed first, so all aspects of find operations and of find(PS) apply. If the find operation was successful, the match that was acted on can be recalled using getLastMatch(). In particular, remember that by default Sikuli stops the script by raising a FindFailed exception if the visual object cannot be found. This follows the standards of the Python language, which allows you to handle such exceptions. If you are not really used to programming and do not have a good knowledge of Python, it may be helpful to first read about the exception FindFailed.

Normally all these region methods are used as reg.click(PS), where reg is a region object. If written as click(PS) the implicit find is done on the default screen being the implicit region in this case. But using region.click() will restrict the search to the region's rectangle and speed up processing, if region is significantly smaller than the whole screen.

Note on IDE: Capturing is a tool in the IDE to quickly set up images to search for. These images are named automatically by the IDE and stored together with the script when it is saved (we call this location in the file system the bundle path). Behind the scenes, the images themselves are specified by a string containing the file name (path to an image file).

Note: If you need to implement more sophisticated mouse and keyboard actions, look at Low Level Mouse and Keyboard Actions.

Note: If more than one monitor is active, read Multi Monitor Environments first.

Note on Mac: it might be necessary to use switchApp() first to prepare the application for accepting the action.


click( PSMRL, [modifiers] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

modifiers: one or more key modifiers (read for details)

Performs a mouse click on the click point using the left button.

returns: the number of performed clicks (currently always 1). A 0 (integer zero) means that, for some reason, no click could be performed. This is the case if, using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards


doubleClick( PSMRL, [modifiers] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

modifiers: one or more key modifiers (read for details)

Performs a mouse double click on the click point using the left button.

returns: the number of performed double clicks (currently always 1). A 0 (integer zero) means that, for some reason, no click could be performed. This is the case if, using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards


rightClick( PSMRL, [modifiers] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

modifiers: one or more key modifiers (read for details)

Performs a mouse click on the click point using the right button.

returns: the number of performed clicks (currently always 1). A 0 (integer zero) means that, for some reason, no click could be performed. This is the case if, using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards


highlight( [seconds] )

seconds: a decimal number taken as a duration in seconds

The region is highlighted showing a red colored frame around it. If the parameter seconds is given, the script is suspended for the specified time. If no time is given, the highlighting is started and the script continues. When later on the same highlight call without a parameter is made, the highlighting is stopped (behaves like a toggling switch).

m = find(some_image)

# the red frame will blink for about 7 - 8 seconds
for i in range(5):
        m.highlight(1)
        wait(0.5)

# a second red frame will blink as an overlay to the first one
m.highlight()
for i in range(5):
        m.highlight(1)
        wait(0.5)
m.highlight()

# the red frame will grow 5 times
for i in range(5):
        m.highlight(1)
        m = m.nearby(20)

The red frame is just an overlay in front of all other screen content and stays in its place, independently from the behavior of this other content, which means it is not "connected" to the defining region.


hover( PSMRL )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

Moves the mouse pointer to the click point and does nothing else.

returns: the number 1 if the mouse pointer could be moved to the click point. If using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed, a 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards


dragDrop( PSMRL, PSMRL, [modifiers] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

modifiers: one or more key modifiers (read for details)

A complete drag-and-drop operation is performed. The first PSMRL is the starting click point, the second PSMRL the target clickpoint.

Sideeffect: when using PS as the second parameter or both, the target match can be accessed using getLastMatch() afterwards. If only the first parameter is given as PS, this match is returned by getLastMatch().

Having problems with dragDrop?
When the operation does not perform as expected (usually caused by timing problems due to delayed reactions of applications), you may adjust the internal timing parameters Settings.DelayAfterDrag and Settings.DelayBeforeDrop, possibly combined with the internal timing parameter Settings.MoveMouseDelay.
Another solution might be to use a combination of drag() and dropAt() together with your own wait() calls.
If the mouse movement from source to target is the problem, you might break up the move path into short steps using mouseMove(nextLocation).
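
The last two suggestions can be combined into one small helper, sketched here under assumptions: the name drag_via, its parameters and the default pause of 0.2 seconds are hypothetical, and waypoints is expected to be a list of location objects (or anything else mouseMove() accepts) that you prepare yourself.

```python
def drag_via(region, source, waypoints, target, pause=0.2):
    """Drag source to target, visiting waypoints on the way (hypothetical helper)."""
    region.drag(source)              # press and hold the left button on source
    for point in waypoints:
        region.mouseMove(point)      # one short step along the path
        region.wait(pause)           # give the application time to react
    return region.dropAt(target)     # release the button on target
```

The pause after every step is the part to tune: slow applications may need a longer value, while a value of 0 effectively turns the helper into a plain stepwise move.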

Note: If you need to implement more sophisticated mouse and keyboard actions look at Low Level Mouse and Keyboard Actions.


drag( PSMRL )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

The mouse pointer is moved to the click point and the left mouse button is pressed and held until another mouse action is performed (e.g. a dropAt() afterwards). This is normally used to start a drag-and-drop operation.

returns: the number 1 if the operation could be performed. If using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed, a 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards

Note: If you need to implement more sophisticated mouse and keyboard actions look at Low Level Mouse and Keyboard Actions.


dropAt( PSMRL, [delay] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

delay: a number, which can have a fraction, as waiting time in seconds. The internal granularity is milliseconds.

The mouse pointer is moved to the click point. After waiting for delay seconds, the left mouse button is released. This is normally used to finalize a drag-and-drop operation. If it is necessary to visit one or more click points after dragging and before dropping, you can use mouseMove() in between.

returns: the number 1 if the operation could be performed. If using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed, a 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffect: when using PS, the match can be accessed using getLastMatch() afterwards

Note: If you need to implement more sophisticated mouse and keyboard actions look at Low Level Mouse and Keyboard Actions.


type( [PSMRL], text, [modifiers] )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

text: a string containing the characters, that should be typed from left to right

modifiers: one or more key modifiers (read for details)

Simulates keyboard typing, interpreting the characters of text based on the layout/keymap of the standard US keyboard (QWERTY). Special keys (ENTER, TAB, BACKSPACE, ...) can be incorporated into text by using the constants defined in class Key with standard string concatenation (+). If PSMRL is given, a click on the click point is performed before typing to gain the focus. (Mac: it may be necessary to use switchApp() first to prepare the application for accepting the click.)

If PSMRL is omitted, the typing is performed on the currently focused visual component (normally an input field or a menu entry that can be selected by typing something).

returns: the number 1 if the operation could be performed, otherwise 0 (integer zero). If using PS (which yields an implicit find), the find fails and you have switched off raising the exception FindFailed, a 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Sideeffects: when using PS, the match can be accessed using getLastMatch() afterwards

Note: If you need to type international characters or you are using layouts/keymaps other than US-QWERTY, you should use paste() instead. Since type() is rather slow, also use paste() to enter longer texts.


paste( [PSMRL], text )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

text: a string that should be pasted into the target (e.g. an input field)

Pastes text using the clipboard and the OS-level shortcut (Ctrl-V or Cmd-V). Afterwards your clipboard therefore contains text. paste() is a temporary solution for typing international characters or typing on keyboard layouts other than US-QWERTY.

If PSMRL is given, a click on the click point is performed before pasting, to gain the focus. (Mac: it may be necessary to use switchApp() beforehand, to prepare the application for accepting the click.)

If PSMRL is omitted, the paste is performed on the currently focused component (normally an input field).

returns: the number 1 if the operation could be performed, otherwise 0 (integer zero). If you use PS (which triggers an implicit find), the find fails, and you have switched off the raising of exception FindFailed, 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Side effects: when using PS, the match can be accessed using getLastMatch() afterwards.

NOTE: Special keys (ENTER, TAB, BACKSPACE, ...) cannot be used with paste(). If needed, you have to split your complete text into two or more paste() calls and use type() for typing the special keys in between. Characters like \n (enter/new line) and \t (tab) should work as expected with paste().
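
The splitting described in the note can be wrapped in a small helper. The following is only a sketch: enter_lines is a hypothetical name that is not part of Sikuli, and the paste and type functions are passed in as parameters so that the logic can be tried without a running Sikuli session. In a real script you would presumably pass Sikuli's own paste and type together with a constant such as Key.ENTER.

```python
def enter_lines(lines, paste_fn, type_fn, enter_key):
    """Paste each line, then send one special key through type().

    paste_fn and type_fn stand for Sikuli's paste() and type();
    enter_key stands for a constant such as Key.ENTER.
    enter_lines itself is a hypothetical helper, not a Sikuli function.
    """
    count = 0
    for line in lines:
        if line:                 # nothing to paste for an empty line
            paste_fn(line)       # fast, independent of the keyboard layout
        type_fn(enter_key)       # special keys have to go through type()
        count += 1
    return count


# Stand-ins that only record the calls, to try the helper outside Sikuli
calls = []
done = enter_lines(
    ["first line", "second line"],
    lambda text: calls.append(("paste", text)),
    lambda key: calls.append(("type", key)),
    "<ENTER>",
)
```

The recorded calls alternate between paste and type, which is the order the note asks for.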

[ Acting on a Region ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


5.7. Extracting Text from a Region

text

[ Navigator ] - [ top of Document ]


text()

returns: the text as a string, that was extracted from the region with the Sikuli X OCR-feature.

[ Extracting Text from a Region ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


5.8. Low Level Mouse and Keyboard Actions

getMouseLocation - keyDown - keyUp - mouseDown - mouseMove - mouseUp - wheel

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

button: when used as parameter in mouse actions, it has to be one of the constants Button.LEFT, Button.MIDDLE, Button.RIGHT or a combination using the + or | operator.

Note: In case of having more than one Monitor active, read Multi Monitor Environments before.

[ Navigator ] - [ top of Document ]


mouseDown( button )

button: it has to be one of the constants Button.LEFT, Button.MIDDLE, Button.RIGHT denoting the button to be used or a combination.

The mouse button(s) button is(are) pressed and held until another mouse action is performed.

returns: the number 1 if the operation could be performed otherwise 0 (integer null).

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


mouseUp( [button] )

button: it has to be one of the constants Button.LEFT, Button.MIDDLE, Button.RIGHT denoting the button to be used or a combination.

The mouse button(s) button is(are) released. If button is omitted, all currently pressed buttons are released.

returns: the number 1 if the operation could be performed otherwise 0 (integer null).

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


mouseMove( PSMRL )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

The mouse pointer is moved to the target pixel that is evaluated from PSMRL.

returns: the number 1 if the operation could be performed. If you use PS (which triggers an implicit find), the find fails, and you have switched off the raising of exception FindFailed, 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Side effects:

  • lastMatch: when using PS, the match can be accessed using getLastMatch() afterwards

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


wheel( PSMRL, WHEEL_DOWN | WHEEL_UP, steps )

PSMRL: a pattern, a string, a match, a region or a location that evaluates to a click point (read for details)

WHEEL_DOWN | WHEEL_UP: one of these two constants to denote the wheeling direction

steps: an integer number specifying the amount of wheeling

The mouse pointer is moved to the target pixel that is evaluated from PSMRL. Then the mouse wheel is turned the given number of steps in the given direction.

returns: the number 1 if the operation could be performed. If you use PS (which triggers an implicit find), the find fails, and you have switched off the raising of exception FindFailed, 0 (integer zero) is returned. Otherwise the script is stopped with a FindFailed exception (read details).

Side effects:

  • lastMatch: when using PS, the match can be accessed using getLastMatch() afterwards

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


getMouseLocation()

Usage: Env.getMouseLocation() (a static method of Class Env).

returns: the current position of the mouse cursor as an object of Class Location.

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


keyDown( key | list of keys )

key | list of keys: one or more keys (use the constants of class Key). A list of keys is a concatenation using "+" of some constants.

The given keys are pressed and held until released by a later keyUp().

returns: the number 1 if the operation could be performed otherwise 0 (integer null).

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


keyUp( [ key | list of keys ] )

[ key | list of keys ]: one or more keys (use the constants of class Key). A list of keys is a concatenation using "+" of some constants.

The given keys are released. If no key is given, all currently pressed keys are released.

returns: the number 1 if the operation could be performed otherwise 0 (integer null).
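
Because keys pressed with keyDown() stay pressed until keyUp() is called, it can be useful to guarantee the release even when the action in between fails. A minimal sketch of that idea follows; with_keys_held and FakeRegion are hypothetical names introduced only for illustration, and FakeRegion merely records calls so the snippet can run without Sikuli. In a script, a real region or SCREEN would take its place.

```python
def with_keys_held(region, keys, action):
    """Hold keys down while action() runs, then always release them."""
    region.keyDown(keys)
    try:
        return action()
    finally:
        region.keyUp(keys)   # runs even if action() raises (e.g. FindFailed)


class FakeRegion(object):
    """Stand-in that only records calls; not a Sikuli class."""

    def __init__(self):
        self.log = []

    def keyDown(self, keys):
        self.log.append(("keyDown", keys))
        return 1

    def keyUp(self, keys=None):
        self.log.append(("keyUp", keys))
        return 1


fake = FakeRegion()
result = with_keys_held(fake, "<SHIFT>", lambda: "clicked")
```

The same pattern could be applied to mouseDown() and mouseUp().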

[ Low Level Mouse and Keyboard Actions ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


5.9. Exception FindFailed

By default, find operations (explicit and implicit) that are not successful raise an exception FindFailed, which stops the script immediately. This is of great help when developing a script, because it lets you adjust timing and similarity step by step. Once the script runs perfectly, an exception FindFailed signals that something is not as it should be.

To implement checkpoints where you want to verify your workflow, use exists(), which reports a not-found situation without raising FindFailed (it returns None instead, which counts as false in an if statement).

To run all or only parts of your script without FindFailed exceptions being raised, use setThrowException() to switch the behavior on and off as needed.

For more sophisticated concepts, you can implement your own exception handling using the standard Python construct try: except:.

Example: 3 solutions for a case where you want to decide how to proceed in a workflow depending on whether a specific image can be found. (pass is the Python statement that does nothing but is needed here to form the indented blocks.)

# --- nice and easy
if exists("path-to-image"): # no exception, returns None when not found
        pass # it is there
else:
        pass # we miss it

# --- using exception handling
# every not found in the try block will switch to the except block
try:
        find("path-to-image")
        pass # it is there
except FindFailed:
        pass # we miss it

# --- using setThrowException
setThrowException(False) # no exception raised, not found returns None
if find("path-to-image"):
        setThrowException(True) # reset to default
        pass # it is there
else:
        setThrowException(True) # reset to default
        pass # we miss it

[ Navigator ] - [ top of Document ]


setThrowException( False | True ) / getThrowException()

  • setThrowException( False | True )

By using this method you control, how Sikuli should handle not found situations. If used without specifying a region, the default/primary screen (default region SCREEN) is used.

True: from now on, unsuccessful find operations (explicit or implicit) will raise exception FindFailed (the default when a script is started).

False: from now on, unsuccessful find operations will not raise exception FindFailed. Find operations will return None, actions like click() will do nothing and mostly return 0.

  • getThrowException()

returns: the current setting as True or False (after start of script, this is True by default). If used without specifying a region, the default/primary screen (default region SCREEN) is used.

[ Exception FindFailed ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


5.10. Grouping Method Calls (with Region:)

Instead of

# reg is a region object
if not reg.exists(image1):
        reg.click(image2)
reg.wait(image3, 10)
reg.doubleClick(image4)

you can say

# reg is a region object
with reg:
        if not exists(image1):
                click(image2)
        wait(image3, 10)
        doubleClick(image4)

All methods inside the with block that have the region omitted are redirected to the region object specified at the with statement.

[ Grouping Method Calls ] - [ Methods of Region ] - [ Navigator ] - [ top of Document ]


6. Class Screen



Class Screen provides a representation of a physical monitor, and it is where the capturing process (grabbing a rectangle from a screenshot, to be used for further processing with find operations) is implemented. For multi monitor environments it contains features to map to the relevant monitor.

Since Screen is of class Region, grouping method calls can be used with a screen object. Normally this should only be relevant in Multi Monitor Environments, to use it for screens other than the default/primary screen, where you have this feature by default. Be aware that using the whole screen for find operations may have an impact on performance. So if possible, either use setROI() or restrict a find operation to a smaller region object (e.g. reg.find()) to speed up processing.

[ Navigator ] - [ top of Document ]


6.1. Methods of Screen

Since class Screen extends class Region, all methods of class Region can be used with an existing screen object.

As a convenience here you have the most useful ones:

click - drag - dragDrop - dropAt - doubleClick - exists - find - findAll - getCenter - getH - getLastMatch - getLastMatches - getROI - getW - getX - getY - hover - keyDown - keyUp - mouseDown - mouseMove - mouseUp - observe - onAppear - onVanish - onChange - paste - rightClick - setAutoWaitTimeout - setH - setROI - setThrowException - setW - setX - setY - stopObserver - text - type - wait - waitVanish

additionally:

capture - getBounds - getNumberScreens - selectRegion

[ Navigator ] - [ top of Document ]


6.2. Screen: Setting, Getting Attributes and Information

getBounds - getNumberScreens

Screen(), Screen( id )

id: an integer number

Creates a new screen object. If id is omitted, it represents the default/primary monitor (whose id is 0). Numbers 1 and higher represent additional monitors that are available while the script is running (read for details).

Using a number that does not represent an existing monitor will stop the script with an error. You can use either getNumberScreens() or exception handling to avoid this.

returns: a new screen object.

Note: If you want to access the default/primary monitor ( Screen(0) ) without creating a new screen object, use the constant reference SCREEN, which is initialized when your script starts: SCREEN=Screen(0).

[ Screen: Setting and Getting Attributes ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


getNumberScreens()

returns: the number of screens available at the time the script is running (read for details).

[ Screen: Setting and Getting Attributes ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


getBounds()

returns: a rectangle object, where width and height denote the dimensions of the screen object representing a monitor. This attribute is taken from the operating system and cannot be changed using Sikuli script.

Note: In case of having more than one Monitor active, read Multi Monitor Environments for more information.

[ Screen: Setting and Getting Attributes ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


6.3. Screen as (Default) Region

Normally all region methods are used as reg.find(PS), where reg is a region object (or a screen or a match object). If written as find(PS) it acts on the default screen being the implicit region in this case (mapped to the constant reference SCREEN). In Multi Monitor Environments this is the primary monitor (use the constant reference SCREEN, to access it all the time), that normally is Screen(0), but might be another Screen() object depending on your platform.

So it is a convenience feature that can be seen as an implicit use of the Python construct with object:.

On the other hand this may slow down processing speed, because of time consuming searches. So to speed up processing, saying region.find() will restrict the search to the specified rectangle. Another possibility is to say setROI() to restrict the search for all following find operations to a smaller region than the whole screen. This will speed up processing, if the region is significantly smaller than the whole screen.

[ Screen as (Default) Region ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


6.4. Capturing

capture - selectRegion

Capturing is the feature that lets you grab a rectangle from a screenshot and save it for later use. Each time a capture is initiated, a new screenshot is taken.

There are two different versions: the first one capture() saves the content of the selected rectangle in a file and returns its file name, whereas the second one selectRegion() just returns the position and dimension of the selected rectangle.

Both features are available in the IDE via buttons.


capture( [region | rectangle | text] ), capture( x, y, w, h )

  • no parameter: enters the interactive mode
  • text: enters the interactive mode and text is displayed for about 2 seconds in the middle of the screen
  • region: an existing region object
  • rectangle: an existing rectangle object (e.g. as a return value of another region method)
  • x, y, w, h: position (x,y) and dimension width and height of the rectangle to capture

Interactive mode: The script enters the screen-capture mode like when clicking the button in the IDE, enabling the user to capture a rectangle on the screen. If no text is given, the default "Select a region on the screen" is displayed.

If any arguments other than text are specified, capture() automatically captures the given rectangle of the screen. In any case, a new screenshot is taken, the content of the selected rectangle is saved in a temporary file. The file name is returned and can be used later in the script as a reference to this image. It can be used directly in cases, where a parameter PS is allowed (e.g. find(), click(), ...).

returns: the path to the file, where the captured image was saved. In interactive mode, the user may cancel the capturing, in which case None is returned.

[ Capturing ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


selectRegion()

Enables the user of a script to select a rectangle on the screen. For a description, see selectRegion( [text] ) under Class Region.

returns: a new region object or None, if the user cancels the capturing process.

[ Capturing ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


6.5. Multi Monitor Environments

If more than one monitor is available, Sikuli is able to manage regions and click points on these monitors.

The base is the coordinate system (picture above), that positions the primary monitor with its upper left corner at (0,0) extending the x-direction to the right and the y-direction towards the lower boundary of the screen. The position of additional monitors can be configured in the operating system to be on either side of the primary monitor, with different positions and sizes. So monitors left of the primary will have pixels with negative x-values and monitors above will have negative y-values (left and above both x and y are negative).
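
To make the coordinate system concrete, the following sketch decides which monitor a point belongs to. It is plain Python working on (x, y, w, h) tuples; monitor_containing and the sample layout are assumptions for illustration only. In a script, the rectangles would presumably come from Screen( id ).getBounds().

```python
def monitor_containing(point, monitors):
    """Return the id of the first monitor whose rectangle contains point.

    point    -- (x, y) in the common coordinate system
    monitors -- list of (x, y, w, h) rectangles, list index = screen id
    Returns None if the point is outside every monitor.
    """
    px, py = point
    for screen_id, (x, y, w, h) in enumerate(monitors):
        if x <= px < x + w and y <= py < y + h:
            return screen_id
    return None


# Hypothetical layout: primary monitor 1920x1080 at (0, 0),
# a second 1280x1024 monitor placed to its left (negative x-values).
layout = [(0, 0, 1920, 1080), (-1280, 0, 1280, 1024)]

on_primary = monitor_containing((100, 100), layout)
on_second = monitor_containing((-1, 500), layout)
nowhere = monitor_containing((-1, 1050), layout)
```

In this layout the point (-1, 1050) lies below the smaller left monitor and left of the primary one, so it belongs to no monitor at all.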

At script start, Sikuli gets the relevant information from the operating system and creates the respective screen objects. Each has an ID (0 for the first or primary monitor, 1 and higher for additional monitors, with a maximum of one less than the number of screens) and knows the rectangle it covers in the coordinate system. This information is read-only for a script.

These predefined screen objects can be accessed with Screen(0), Screen(1), ... and are normally used to create your own screen objects. The possibility to use the region methods on a default region mapped to the primary monitor is implemented with the constant reference SCREEN. This concept is only available for the primary monitor.

How to get the relevant information:

  • getNumberScreens() returns the number of available screens
  • getBounds() returns the rectangle, that is covered by the default/primary monitor
  • scr.getBounds() returns the rectangle, that is covered by the screen object scr, that was created before using Screen( id ).

You should analyse the information returned by these methods to evaluate the specific situation with your monitor configuration.

Be aware: Changes in your system settings are only recognized by the IDE, when it is started.

Windows: The monitor that is the first one based on hardware mapping (e.g. the laptop monitor) will always be Screen(0). In the Windows settings it is possible to place the taskbar on one of the secondary monitors, which makes it the primary monitor with the base coordinates (0,0). The other available monitors are arranged around it based on your settings. The mapping of ids is not changed by this, so the primary monitor might be any of your Screen() objects. Sikuli takes care of that and always maps SCREEN to the primary monitor (the one with the (0,0) coordinates).
For example, if you have a laptop with an external monitor that shows the taskbar (and is therefore the primary monitor):
- SCREEN maps to Screen(1)
- Screen(0) is your laptop monitor

Mac: The monitor, that has the System Menu Bar, is always Screen(0) and mapped to the default SCREEN.

Linux: under evaluation

So with the rectangle it covers, a screen object is always identical to the monitor it was created for using Screen( id ). Using setROI() to restrict the region of interest for find operations has no effect on the base rectangle of the screen object.

On the other hand, region objects and location objects can be positioned anywhere in the coordinate system. Only when a find operation or a click action has to be performed does the object's rectangle or point have to be inside the rectangle of an existing monitor (basically represented by Screen(0), Screen(1), ...). When this condition is met, everything works as expected and known from a single monitor system.

With finding and acting there are the following exceptions:

  • Point Outside: a click point is outside any monitor rectangle. The clickpoint will be mapped to the edges or corners of the primary monitor according to the relative position:
    • to the edges if its x or y value is in the range of the respective edge (right, left, above, below)
    • to the corners, if x and y are outside any range of any edge (left/above -> upper left corner, ...)
  • Region Outside: a region is completely outside any Monitor
    • a click action is handled according to Point Outside
    • a find operation will always fail
  • Region Partly Outside: a region is partly outside of one monitor, no overlap on another monitor
    • a click action is handled according to Point Outside, if this condition is met, otherwise it is processed
    • for a find operation, the region will be reduced to the part totally contained in the monitor rectangle
  • Region Overlaps Monitors: a region overlaps two or more monitors
    • a click action is handled according to Point Outside, if this condition is met, otherwise it is processed
    • for a find operation, the region will be reduced to the part totally contained in the monitor rectangle belonging to the screen with the smallest id (so with 2 monitors this is always the primary monitor), so the find may fail, because a possible match is partly or totally outside the selected monitor.
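
The Point Outside rule above amounts to clamping each coordinate to the primary monitor's rectangle. The following sketch shows that arithmetic in plain Python. map_to_primary is a hypothetical name, and the choice of the last pixel inside the rectangle (x + w - 1, y + h - 1) as the edge is an assumption, since the text does not state the exact pixel Sikuli uses.

```python
def map_to_primary(point, primary):
    """Clamp an outside click point to the primary monitor rectangle.

    point   -- (x, y) of the click point
    primary -- (x, y, w, h) of the primary monitor
    A coordinate that is already in range is kept (edge case);
    if both are out of range the result is a corner.
    """
    px, py = point
    x, y, w, h = primary
    cx = min(max(px, x), x + w - 1)   # assumed: last column counts as the edge
    cy = min(max(py, y), y + h - 1)   # assumed: last row counts as the edge
    return (cx, cy)


primary = (0, 0, 1920, 1080)
right_edge = map_to_primary((2500, 300), primary)     # only x is out of range
upper_left = map_to_primary((-50, -50), primary)      # both out of range
```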

An interactive capture (the user selects an image / a rectangle - capture() - selectRegion()) will automatically be restricted to the monitor rectangle, where it was started.

A scripted capture using a rectangle or a region, will be handled according to the exceptions above.

  • Region Outside: no image is captured, None is returned
  • Region Partly Outside: the returned image will only cover the part contained in the monitor rectangle
  • Region Overlaps Monitors: the returned image will only cover the part contained in the monitor rectangle of the monitor with the smallest id.
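
Reducing a region to the part contained in a monitor rectangle is a rectangle intersection. The sketch below shows the calculation on (x, y, w, h) tuples; visible_part is a hypothetical helper, not a Sikuli function, and it only illustrates the geometry behind the three cases above.

```python
def visible_part(region, monitor):
    """Intersect a region rectangle with a monitor rectangle.

    Both arguments are (x, y, w, h). Returns the overlapping
    (x, y, w, h), or None if the region is completely outside.
    """
    rx, ry, rw, rh = region
    mx, my, mw, mh = monitor
    left = max(rx, mx)
    top = max(ry, my)
    right = min(rx + rw, mx + mw)
    bottom = min(ry + rh, my + mh)
    if right <= left or bottom <= top:
        return None                      # Region Outside
    return (left, top, right - left, bottom - top)


monitor = (0, 0, 1920, 1080)
partly = visible_part((1800, 1000, 300, 200), monitor)   # Region Partly Outside
outside = visible_part((2000, 0, 100, 100), monitor)     # Region Outside
```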

Based on the knowledge of your monitor configuration, you can now start some further evaluations using e.g. hover() together with setShowActions(True) and region highlight.

[ Multi Monitor Environments ] - [ Methods of Screen ] - [ Navigator ] - [ top of Document ]


7. Class Location

This class is there as a convenience, to handle single points on the screen directly by their position (x, y). It is mainly used in the actions on a region, to directly denote the click point. It contains methods to "move" a point around on the screen.

[ Navigator ] - [ top of Document ]


7.1. Methods of Location

above - below - getX - getY - left - offset - right - setX - setY

[ Navigator ] - [ top of Document ]


7.2. Creating a Location, Setting and Getting Attributes

Location( x, y )

returns: a new location object, representing the position (x, y) on the screen.

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


getX(), getY(), setX( number ), setY( number )

getX(), getY(): the x or y value is returned (saying location.x or location.y is equivalent).

setX( number ), setY( number ): the respective attribute is set to number (saying location.x = number or location.y = number is equivalent).

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


offset( dx, dy )

returns: a new Location(x+dx, y+dy)

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


above( dy )

returns: a new Location(x, y-dy)

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


below( dy )

returns: a new Location(x, y+dy)

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


left( dx )

returns: a new Location(x-dx, y)

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


right( dx )

returns: a new Location(x+dx, y)
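
The five "move" methods are simple coordinate arithmetic, and each returns a new object instead of changing the original. The following plain-Python stand-in illustrates this; Loc and as_tuple are hypothetical names used only so the example can run without Sikuli, and the class is not Sikuli's Location.

```python
class Loc(object):
    """Plain-Python stand-in for Location (hypothetical, for illustration)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def offset(self, dx, dy):
        return Loc(self.x + dx, self.y + dy)

    def above(self, dy):
        return self.offset(0, -dy)    # y grows downwards, so above subtracts

    def below(self, dy):
        return self.offset(0, dy)

    def left(self, dx):
        return self.offset(-dx, 0)

    def right(self, dx):
        return self.offset(dx, 0)

    def as_tuple(self):
        return (self.x, self.y)


start = Loc(100, 200)
moved = start.right(50).below(20)     # calls can be chained
```

Here moved is at (150, 220), while start still holds (100, 200).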

[ Creating a Location, Setting and Getting Attributes ] - [ Methods of Location ] - [ Navigator ] - [ top of Document ]


8. Class Match

An object of class Match represents the result of a successful find operation. It has the rectangle dimension of the image, that was used to search. It knows the point of its upper left corner on an existing monitor, where it was found. You can act on it using the applicable methods of Class Region.

[ Navigator ] - [ top of Document ]


8.1. Methods of Match

Since class Match extends class Region, all methods of class Region can be used with a match object.

As a convenience here you have the most useful ones:

above - below - click - drag - dragDrop - dropAt - doubleClick - exists - find - findAll - getCenter - getH - getLastMatch - getLastMatches - getScreen - getW - getX - getY - hover - inside - left - mouseMove - nearby - observe - onAppear - onVanish - onChange - paste - right - rightClick - selectRegion - setAutoWaitTimeout - setThrowException - stopObserver - type - wait - waitVanish

additionally available:

getScore - getTarget

[ Navigator ] - [ top of Document ]


8.2. Creating a Match, Getting Attributes

A match object is created as the result of an explicit find operation. It can be saved in a variable for later use with actions like click().

It has the rectangle dimension of the image, that was used to search. It knows the point of its upper left corner on an existing monitor, where it was found. It knows the similarity it was found with and a click point to be used, if set by a pattern.

m = find("path-to-image-file") # m is a reference to a match object, if found
print m # message area: Match[833,248 447x141] score=0.89, target=null

m = find(Pattern("path-to-image-file").similar(0.5).targetOffset(100,0)) # m is a reference to a match object, if found
print m # message area: Match[406,461 140x123] score=0.75, target=(576,522)

For all other aspects the features and attributes of class Region apply.

[ Creating a Match, Getting Attributes ] - [ Methods of Match ] - [ Navigator ] - [ top of Document ]


getScore()

returns: the similarity that the image/pattern was found with as a decimal value: 0 < value <= 1

[ Creating a Match, Getting Attributes ] - [ Methods of Match ] - [ Navigator ] - [ top of Document ]


getTarget()

returns: a location object, that will be used as click point with actions like click(). If the find operation was based on a pattern without using targetOffset() or an image, this will be the same as getCenter(). Otherwise it will be calculated and returned as getCenter().offset(targetX, targetY), where targetX and targetY are the respective values of targetOffset() of the pattern used for the find operation.
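
The calculation getCenter().offset(targetX, targetY) can be checked by hand against the second example in section 8.2. The sketch below does that in plain Python; click_point is a hypothetical helper, and the use of integer division for the center is an assumption that happens to reproduce the printed target.

```python
def click_point(x, y, w, h, dx=0, dy=0):
    """Center of the match rectangle plus the pattern's target offset.

    x, y, w, h -- rectangle of the match
    dx, dy     -- values given to targetOffset(), 0 if not used
    Integer division for the center is an assumption.
    """
    center_x = x + w // 2
    center_y = y + h // 2
    return (center_x + dx, center_y + dy)


# Match[406,461 140x123] found with targetOffset(100,0)
target = click_point(406, 461, 140, 123, 100, 0)
```

The result is (576, 522), the same as the target shown in the message area of that example.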

[ Creating a Match, Getting Attributes ] - [ Methods of Match ] - [ Navigator ] - [ top of Document ]


8.3. Iterating over Matches after findAll()

A find operation findAll() returns an iterator object that contains all matches found. It can be stepped through to get each match object in turn. Remember, that a reference to the iterator of matches is stored in the respective region and can be accessed using getLastMatches().

Important to know:

  • per definition, an iterator can be stepped through only once - it is empty afterwards
  • it has to be destroyed manually using iteratorMatches.destroy(), when used with for: or while:
  • when used in a with: construct, it is destroyed automatically

With class Finder you can read more about the basics of an iterator. There you find an example, how to save the contained matches for later use.

The methods to use:

  • hasNext(): returns True, if there is at least one match left, otherwise False
  • next(): returns the next match, if there is at least one match left, otherwise None
  • destroy(): destroys the iterator object (releases memory)

Example: using while: with default screen

findAll("path-to-an-imagefile") # find all matches
mm = SCREEN.getLastMatches()
while mm.hasNext(): # loop as long there is a first and more matches
        print "found: ",  mm.next() # access the next match in the row

print mm.hasNext() # is False, because mm is empty now
print mm.next() # is None, because mm is empty now
print SCREEN.getLastMatches().hasNext() # is False also ;-)
mm.destroy() # to save memory

Example: using with: with default screen

with findAll("path-to-an-imagefile") as mm:
        while mm.hasNext(): # loop as long there is a first and more matches
                print "found: ",  mm.next() # access the next match in the row
# mm will be None afterwards (destroyed automatically)
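
If the matches are needed more than once, they can be copied into a list before the iterator is destroyed. The following sketch uses only the three documented methods hasNext(), next() and destroy(); collect_matches and FakeMatches are hypothetical names, and FakeMatches imitates the iterator so the snippet can run without Sikuli.

```python
def collect_matches(iterator):
    """Copy all remaining matches into a list, then destroy the iterator."""
    matches = []
    try:
        while iterator.hasNext():
            matches.append(iterator.next())
    finally:
        iterator.destroy()       # release memory even if something fails
    return matches


class FakeMatches(object):
    """Stand-in for the iterator returned by findAll(); not a Sikuli class."""

    def __init__(self, items):
        self.items = list(items)
        self.destroyed = False

    def hasNext(self):
        return len(self.items) > 0

    def next(self):
        if self.items:
            return self.items.pop(0)
        return None              # empty iterator returns None

    def destroy(self):
        self.items = []
        self.destroyed = True


it = FakeMatches(["m1", "m2", "m3"])
saved = collect_matches(it)
number_found = len(saved)        # the list can be counted and reused
```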

[ Iterating over Matches ] - [ Methods of Match ] - [ Navigator ] - [ top of Document ]


9. Class Finder

A Finder object is based on an iterator of matches and lets you search for a visual object in an image file that you provide (e.g. a screenshot taken and saved in a file before). After setting up the finder object and doing a find operation, you can iterate through the found matches, if any.

Important to know:

  • per definition, an iterator can be stepped through only once - it is empty afterwards
  • it has to be destroyed manually using iteratorMatches.destroy(), especially when used with for: or while:
  • when used in a with: construct, it is destroyed automatically

Compared with the region-based find operation findAll(), no exception FindFailed is raised if nothing is found at all (use hasNext() to check). What region.lastMatches is with findAll() is the finder object itself here.

Note: With this version, there is no chance, to get the number of matches in advance. If you would iterate through to count, afterwards your finder would be empty. So in this case, you have to save your matches somehow (one possible solution see example below).

[ Navigator ] - [ top of Document ]


9.1. Methods of Finder

find - hasNext - next

[ Navigator ] - [ top of Document ]


9.2. Creating a Finder

Finder( path-to-imagefile )

path-to-imagefile: just what it says (e.g. your screenshot)

First you have to create a new finder object.

9.3. Using a Finder

The workflow is always the same: first do a find operation, then go through the matches found. After a complete iteration, the finder object is empty again, so you can start a new find operation.

find( path-to-imagefile, [ similarity ] )

path-to-imagefile: just what it says (your image to search for)

[ similarity ]: the minimum similarity a match should have, if omitted, the default is used (read for details)

Searches for the given image with reference to the minimum similarity and stores the matches in the finder object.

hasNext()

Checks if there are more matches available. So when used the first time after a find(), it tells you whether there are matches at all.

returns: True if there are more matches available, otherwise False

next()

Accesses the next match in the row. The returned reference to a match object is no longer available in the finder object afterwards. So if you need it later on, you have to save it somehow.

returns: a match object


Example: just the basics

f = Finder("path-to-an-imagefile") # create a Finder with your saved screenshot
img= "path-to-an-imagefile" # the image you are searching
f.find(img) # find all matches
while f.hasNext(): # loop as long there is a first and more matches
        print "found: ",  f.next() # access the next match in the row
print f.hasNext() # is False, because f is empty now
f.destroy() # to save memory (especially in loops when new objects are created all the time)

Example: we want to know the number of matches before

f = Finder("path-to-an-imagefile") # create a Finder with your saved screenshot
img = "path-to-an-imagefile" # the image you are searching
f.find(img) # find all matches
mm = [] # an empty list
while f.hasNext(): # loop as long there is a first and more matches
        mm.append( f.next() ) # access next match and add to mm
print f.hasNext() # is False, because f is empty now
# now we have our matches saved in list mm
print len( mm ) # the number of matches
# we want to use our matches
for m in mm:
        print m # or whatever you want
f.destroy() # to save memory (especially in loops when new objects are created all the time)

[ Navigator ] - [ top of Document ]


10. Class Pattern

A pattern is used, to associate an image file with additional attributes used in find operations and when acting on a match object.

Minimum similarity: When using just an image file in a find operation, the search will be successful and return a match object, if the similarity of a possible match is 0.7 or higher. With a pattern object, you can associate a specific similarity value, that will be used as the minimum value, when this pattern object is searched (similar(), exact()). The IDE supports adjusting the minimum similarity with captured images (internally in the script, the images are turned into a pattern object definition)

Click point: Normally, when clicking on a match, the center pixel of the match's rectangle is used. With a pattern object, you can define a different click point (targetOffset()).


10.1. Methods of Pattern

exact - getFilename - getTargetOffset - similar - targetOffset


10.2. Creating a Pattern, Setting and Getting Attributes

Pattern( string )

string: a path to an image file

This initializes a new pattern object without any additional attributes. As long as no further pattern methods are applied, using it in a find operation is the same as using the image file name itself.

returns: a new pattern object

similar( similarity )

similarity: the minimum similarity to use in a find operation as a decimal value: 0 < similarity <= 1

A new Pattern object is created containing the same attributes (image, click point) with the minimum similarity set to the specified value.

returns: a new pattern object

exact()

A new Pattern object is created containing the same attributes (image, click point) with the minimum similarity set to 1.0, which means exact match required.

returns: a new pattern object
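
Because similar() and exact() return a new pattern object, the pattern they are called on stays unchanged. The following toy model illustrates that behavior. It is a sketch only, not the Sikuli implementation: the class name PatternModel and its attribute names are hypothetical, and the default of 0.7 is taken from the description of the minimum similarity above.

```python
class PatternModel(object):
    """Toy stand-in for Pattern (assumption, not the Sikuli class)."""

    def __init__(self, filename, similarity=0.7):
        self.filename = filename
        self.similarity = similarity

    def similar(self, similarity):
        # Copy the image reference, change only the minimum similarity.
        return PatternModel(self.filename, similarity)

    def exact(self):
        # exact() behaves like similar(1.0).
        return self.similar(1.0)


base = PatternModel("button.png")   # hypothetical image file name
loose = base.similar(0.5)
strict = base.exact()
```

Here base still carries the default minimum similarity, while loose and strict are independent objects that can be passed to different find operations.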

targetOffset( dx, dy)

dx, dy: integers, which may be negative.

A new Pattern object is created containing the same attributes (image, similarity), but with a different click point to be used when acting on the match object returned by a successful find operation. The click point is calculated as match.getCenter().offset(dx, dy).

returns: a new pattern object
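
The arithmetic behind match.getCenter().offset(dx, dy) can be sketched in plain Python. This is an illustration under assumptions: the function name click_point is hypothetical, the match rectangle is given as x, y, w, h, and the center is assumed to be computed with integer division.

```python
def click_point(x, y, w, h, dx=0, dy=0):
    """Center of the rectangle (x, y, w, h), shifted by (dx, dy)."""
    center_x = x + w // 2  # assumption: integer division for the center
    center_y = y + h // 2
    return (center_x + dx, center_y + dy)


# A match at (100, 200) that is 50 pixels wide and 20 pixels high.
default_point = click_point(100, 200, 50, 20)          # no target offset
shifted_point = click_point(100, 200, 50, 20, 10, -5)  # targetOffset(10, -5)
```

A negative dy moves the click point up and a negative dx moves it to the left, which is useful when the element to click lies next to the captured image rather than at its center.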

getFilename()

returns: the filename of the image contained in the pattern object (as a string)

getTargetOffset()

returns: the currently specified target offset of the pattern object (as a location object)


11. Class VDict


VDict implements a visual dictionary (hereafter called VDict) whose interface and syntax are modeled after Python's built-in dictionary.

Using a VDict, a user can easily automate the tasks of saving and retrieving arbitrary data objects by using images as keys. Be aware that only those operators and built-in methods of a Python dictionary that are mentioned here are supported.

Internally, a VDict is based on a dictionary whose keys represent images by the path to an image file (so each key is a string). The values paired with the keys (a key and its value together form an item) can be of any data type allowed in the Sikuli environment, which is based on the Python language. To locate an item in a VDict, the same features are used as when searching for images in a region: a key is matched not by comparing the strings representing the filenames, but by comparing the visual content using a given similarity. If you are not yet familiar with this, read the details beginning with the class Pattern.

When using a VDict, the images referenced by the keys must be available (see: bundle path and path to an image file).


11.1. Methods/Operators/Constants of VDict

vdict[ ] - vdict[ ] = - [not] in vdict - del vdict[ ] - get - get1 - get_exact - len( ) - keys

11.2. Setting it up and getting Information

VDict( [ vdict | dict ] )

[ vdict | dict ]: either an existing VDict (named vdict here) or a basic Python dictionary (named dict here) whose keys are strings containing image filenames that can be located at runtime. If omitted, the new VDict is empty.

Constructs a new visual dictionary with the same mapping as the given VDict or dict.

returns: the new VDict

len( vdict )

vdict: an existing VDict named vdict in this case

returns: the number of keys in this visual dictionary.

keys()

returns: a list containing the keys of the VDict.

11.3. Managing Items

vdict[ path-to-an-imagefile ] = value

vdict: the name of an existing VDict object

path-to-an-imagefile: just what it says (your image to search for in the keys)

value: an arbitrary value

A new item is created, mapping the specified image to the specified value.

del vdict[ path-to-an-imagefile ]

vdict: the name of an existing VDict object

path-to-an-imagefile: just what it says (your image to search for in the keys)

Deletes the key and its corresponding value from this VDict.

11.4. Searching for specific Items

path-to-an-imagefile [not] in vdict

vdict: the name of an existing VDict object

path-to-an-imagefile: just what it says (your image to search for in the keys)

Tests whether the image matches at least one of the keys, using the default similarity.

returns: True if at least one matches, False otherwise

vdict[ path-to-an-imagefile ]

vdict: the name of an existing VDict object

path-to-an-imagefile: just what it says (your image to search for in the keys)

returns: a list of all values whose keys match the specified image with the default similarity.

get_exact( path-to-an-imagefile )

path-to-an-imagefile: just what it says (your image to search for in the keys)

returns: the one value whose key exactly matches the specified image.

get1( path-to-an-imagefile, similarity )

path-to-an-imagefile: just what it says (your image to search for in the keys)

similarity: the similarity for matching (a decimal number between 0 and 1)

returns: the one value whose key best matches the specified image with the given similarity.

get( path-to-an-imagefile, similarity, number )

path-to-an-imagefile: just what it says (your image to search for in the keys)

similarity: the similarity for matching (a decimal number between 0 and 1)

number: the maximum number of returned items

returns: a list of the values whose keys match the specified image with the given similarity; the length of the list can be limited by number.
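
How a similarity-based lookup differs from a plain dictionary lookup can be illustrated with a toy model. This is a sketch under loud assumptions and not the VDict implementation: the "images" are short tuples of grayscale pixel values, toy_similarity is a made-up score, the names fuzzy_get and items are hypothetical, and returning the best match first is an assumption as well.

```python
DEFAULT_SIMILARITY = 0.7  # mirrors the documented default


def toy_similarity(a, b):
    """Made-up score in [0, 1] for two equal-length grayscale tuples."""
    diff = sum(abs(p - q) for p, q in zip(a, b))
    return 1.0 - diff / (255.0 * len(a))


def fuzzy_get(items, probe, similarity=DEFAULT_SIMILARITY, number=None):
    """Values whose key scores at least `similarity`, best match first."""
    scored = [(toy_similarity(key, probe), value) for key, value in items]
    hits = [pair for pair in scored if pair[0] >= similarity]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    values = [value for _score, value in hits]
    return values[:number]  # number=None keeps the whole list


items = [
    ((0, 0, 0, 0), "black"),
    ((255, 255, 255, 255), "white"),
    ((10, 0, 0, 0), "almost black"),
]
probe = (0, 0, 0, 0)

all_hits = fuzzy_get(items, probe)                    # like vdict[ ]
exact_hits = fuzzy_get(items, probe, similarity=1.0)  # like get_exact
best_hit = fuzzy_get(items, probe, 0.7, 1)            # like get1
```

The point of the model is that one probe image can match several keys, so a lookup yields a list of values unless an exact match or a single best match is requested.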

11.5. Constants of VDict

_DEFAULT_SIMILARITY

The default similarity for fuzzy matching. The value ranges from 0 to 1.0, where 0 matches everything and 1.0 requires an exact match. The default similarity is 0.7.


12. Key Constants

Applicable usage situations for these predefined constants of special keys and key modifiers can be found in Acting on a Region and Low Level Mouse and Keyboard Actions.


12.1. Key Modifiers

Methods where key modifiers can be used:

click - dragDrop - doubleClick - rightClick - type

  • the oldies but goldies

KEY_ALT - KEY_CTRL - KEY_SHIFT

  • system specific Win/Mac

KEY_WIN (the Windows key) - KEY_CMD (the Apple command key) - KEY_META (a synonym for KEY_WIN or KEY_CMD on the respective system)

Note: These constants are mapped to the corresponding constants of the Java environment in the class java.awt.event.InputEvent. They should be used only as the modifiers parameter in functions like type(), click(), etc.
They should never be used with keyDown() and keyUp().


12.2. Special Keys

The methods supporting the use of special keys are type(), keyDown(), and keyUp().

Usage: Key.CONSTANT (where CONSTANT is one of the following key names). The constants can be concatenated with strings using "+".

  • miscellaneous keys

ENTER - TAB - ESC - BACKSPACE - DELETE - INSERT

  • function keys

F1 - F2 - F3 - F4 - F5 - F6 - F7 - F8 - F9 - F10 - F11 - F12 - F13 - F14 - F15

  • navigation keys

HOME - END - LEFT - RIGHT - DOWN - UP - PAGE_DOWN - PAGE_UP

  • special keys

PRINTSCREEN - PAUSE - CAPS_LOCK - SCROLL_LOCK - NUM_LOCK

  • num pad keys

NUM0 - NUM1 - NUM2 - NUM3 - NUM4 - NUM5 - NUM6 - NUM7 - NUM8 - NUM9
SEPARATOR - ADD - MINUS - MULTIPLY - DIVIDE

  • key modifiers

ALT - CMD - CTRL - META - SHIFT - WIN

Note: The key modifier constants listed here cannot be used as the modifiers parameter of functions like type(), click(), etc.
They can only be used with keyDown() and keyUp().
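
The concatenation with "+" mentioned under Usage can be sketched as follows. So that the example runs outside Sikuli, it uses a hypothetical stand-in class KeyStandIn; the character values assigned to its constants are assumptions for illustration only. The function name login_keystrokes is hypothetical as well.

```python
class KeyStandIn(object):
    """Hypothetical stand-in for Key; the values are assumptions."""
    ENTER = "\n"
    TAB = "\t"


def login_keystrokes(user, password, key=KeyStandIn):
    """Build one string: user name, TAB, password, ENTER."""
    return user + key.TAB + password + key.ENTER


keys_to_type = login_keystrokes("alice", "secret")
```

In a Sikuli script you would pass Key itself instead of the stand-in and hand the resulting string to type().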


13. Class Env

Though formally defined in this class, the following methods are described in the context where they apply:

getOS() returns information about the system the script is running on ( -> General Information and Settings )

getOSVersion() returns the version number of the system the script is running on ( -> General Information and Settings )

getMouseLocation() returns the current location of the mouse pointer ( -> Low Level Mouse and Keyboard Actions )

getClipboard() returns the content of the clipboard if it is text, otherwise an empty string ( -> General Information and Settings )


Attachments