Full Syntax
This section discusses the query language syntax, as defined in the Java Persistence specification. Much of the following material paraphrases or directly quotes the specification.
BNF Symbols
Table 27-1 describes the BNF symbols used in this chapter.
BNF Grammar of the Java Persistence Query Language
Here is the entire BNF diagram for the query language:
QL_statement ::= select_statement | update_statement | delete_statement select_statement ::= select_clause from_clause [where_clause] [groupby_clause][having_clause] [orderby_clause] update_statement ::= update_clause [where_clause] delete_statement ::= delete_clause [where_clause] from_clause ::= FROM identification_variable_declaration {, {identification_variable_declaration | collection_member_declaration}}* identification_variable_declaration ::= range_variable_declaration { join | fetch_join }* range_variable_declaration ::= abstract_schema_name [AS] identification_variable join ::= join_spec join_association_path_expression [AS] identification_variable fetch_join ::= join_specFETCH join_association_path_expression association_path_expression ::= collection_valued_path_expression | single_valued_association_path_expression join_spec::= [LEFT [OUTER] |INNER] JOIN join_association_path_expression ::= join_collection_valued_path_expression | join_single_valued_association_path_expression join_collection_valued_path_expression::= identification_variable.collection_valued_association_field join_single_valued_association_path_expression::= identification_variable.single_valued_association_field collection_member_declaration ::= IN (collection_valued_path_expression) [AS] identification_variable single_valued_path_expression ::= state_field_path_expression | single_valued_association_path_expression state_field_path_expression ::= {identification_variable | single_valued_association_path_expression}.state_field single_valued_association_path_expression ::= identification_variable.{single_valued_association_field.}* single_valued_association_field collection_valued_path_expression ::= identification_variable.{single_valued_association_field.}* collection_valued_association_field state_field ::= {embedded_class_state_field.}*simple_state_field update_clause ::=UPDATE abstract_schema_name [[AS] identification_variable] SET update_item {, update_item}* update_item ::= [identification_variable.]{state_field | single_valued_association_field} = new_value new_value ::= simple_arithmetic_expression | string_primary | datetime_primary | boolean_primary | enum_primary simple_entity_expression | NULL delete_clause ::= DELETE FROM abstract_schema_name [[AS] identification_variable] select_clause ::= SELECT [DISTINCT] select_expression {, select_expression}* select_expression ::= single_valued_path_expression | aggregate_expression | identification_variable | OBJECT(identification_variable) | constructor_expression constructor_expression ::= NEW constructor_name(constructor_item {, constructor_item}*) constructor_item ::= single_valued_path_expression | aggregate_expression aggregate_expression ::= {AVG |MAX |MIN |SUM} ([DISTINCT] state_field_path_expression) | COUNT ([DISTINCT] identification_variable | state_field_path_expression | single_valued_association_path_expression) where_clause ::= WHERE conditional_expression groupby_clause ::= GROUP BY groupby_item {, groupby_item}* groupby_item ::= single_valued_path_expression having_clause ::= HAVING conditional_expression orderby_clause ::= ORDER BY orderby_item {, orderby_item}* orderby_item ::= state_field_path_expression [ASC |DESC] subquery ::= simple_select_clause subquery_from_clause [where_clause] [groupby_clause] [having_clause] subquery_from_clause ::= FROM subselect_identification_variable_declaration {, subselect_identification_variable_declaration}* subselect_identification_variable_declaration ::= identification_variable_declaration | association_path_expression [AS] identification_variable | collection_member_declaration simple_select_clause ::= SELECT [DISTINCT] simple_select_expression simple_select_expression::= single_valued_path_expression | aggregate_expression | identification_variable conditional_expression ::= conditional_term | conditional_expression OR conditional_term conditional_term ::= conditional_factor | conditional_term AND conditional_factor conditional_factor ::= [NOT] conditional_primary conditional_primary ::= simple_cond_expression |( conditional_expression) simple_cond_expression ::= comparison_expression | between_expression | like_expression | in_expression | null_comparison_expression | empty_collection_comparison_expression | collection_member_expression | exists_expression between_expression ::= arithmetic_expression [NOT] BETWEEN arithmetic_expressionAND arithmetic_expression | string_expression [NOT] BETWEEN string_expression AND string_expression | datetime_expression [NOT] BETWEEN datetime_expression AND datetime_expression in_expression ::= state_field_path_expression [NOT] IN (in_item {, in_item}* | subquery) in_item ::= literal | input_parameter like_expression ::= string_expression [NOT] LIKE pattern_value [ESCAPE escape_character] null_comparison_expression ::= {single_valued_path_expression | input_parameter} IS [NOT] NULL empty_collection_comparison_expression ::= collection_valued_path_expression IS [NOT] EMPTY collection_member_expression ::= entity_expression [NOT] MEMBER [OF] collection_valued_path_expression exists_expression::= [NOT] EXISTS (subquery) all_or_any_expression ::= {ALL |ANY |SOME} (subquery) comparison_expression ::= string_expression comparison_operator {string_expression | all_or_any_expression} | boolean_expression {= |<> } {boolean_expression | all_or_any_expression} | enum_expression {= |<> } {enum_expression | all_or_any_expression} | datetime_expression comparison_operator {datetime_expression | all_or_any_expression} | entity_expression {= |<> } {entity_expression | all_or_any_expression} | arithmetic_expression comparison_operator {arithmetic_expression | all_or_any_expression} comparison_operator ::= = |> |>= |< |<= |<> arithmetic_expression ::= simple_arithmetic_expression | (subquery) simple_arithmetic_expression ::= arithmetic_term | simple_arithmetic_expression {+ |- } arithmetic_term arithmetic_term ::= arithmetic_factor | arithmetic_term {* |/ } arithmetic_factor arithmetic_factor ::= [{+ |- }] arithmetic_primary arithmetic_primary ::= state_field_path_expression | numeric_literal | (simple_arithmetic_expression) | input_parameter | functions_returning_numerics | aggregate_expression string_expression ::= string_primary | (subquery) string_primary ::= state_field_path_expression | string_literal | input_parameter | functions_returning_strings | aggregate_expression datetime_expression ::= datetime_primary | (subquery) datetime_primary ::= state_field_path_expression | input_parameter | functions_returning_datetime | aggregate_expression boolean_expression ::= boolean_primary | (subquery) boolean_primary ::= state_field_path_expression | boolean_literal | input_parameter enum_expression ::= enum_primary | (subquery) enum_primary ::= state_field_path_expression | enum_literal | input_parameter entity_expression ::= single_valued_association_path_expression | simple_entity_expression simple_entity_expression ::= identification_variable | input_parameter functions_returning_numerics::= LENGTH(string_primary) | LOCATE(string_primary, string_primary[, simple_arithmetic_expression]) | ABS(simple_arithmetic_expression) | SQRT(simple_arithmetic_expression) | MOD(simple_arithmetic_expression, simple_arithmetic_expression) | SIZE(collection_valued_path_expression) functions_returning_datetime ::= CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP functions_returning_strings ::= CONCAT(string_primary, string_primary) | SUBSTRING(string_primary, simple_arithmetic_expression, simple_arithmetic_expression)| TRIM([[trim_specification] [trim_character] FROM] string_primary) | LOWER(string_primary) | UPPER(string_primary) trim_specification ::= LEADING | TRAILING | BOTHFROM Clause
The
FROM
clause defines the domain of the query by declaring identification variables.Identifiers
An identifier is a sequence of one or more characters. The first character must be a valid first character (letter,
$
,_
) in an identifier of the Java programming language (hereafter in this chapter called simply "Java"). Each subsequent character in the sequence must be a valid nonfirst character (letter, digit,$
,_
) in a Java identifier. (For details, see the J2SE API documentation of theisJavaIdentifierStart
andisJavaIdentifierPart
methods of theCharacter
class.) The question mark (?
) is a reserved character in the query language and cannot be used in an identifier.A query language identifier is case-sensitive with two exceptions:
An identifier cannot be the same as a query language keyword:
It is not recommended that you use a SQL keyword as an identifier, as the list of keywords may expand to include other reserved SQL words in the future.
Identification Variables
An identification variable is an identifier declared in the
FROM
clause. Although theSELECT
andWHERE
clauses can reference identification variables, they cannot declare them. All identification variables must be declared in theFROM
clause.Because an identification variable is an identifier, it has the same naming conventions and restrictions as an identifier with the exception that an identification variables is case-insensitive. For example, an identification variable cannot be the same as a query language keyword. (See the preceding section for more naming rules.) Also, within a given persistence unit, an identification variable name must not match the name of any entity or abstract schema.
The
FROM
clause can contain multiple declarations, separated by commas. A declaration can reference another identification variable that has been previously declared (to the left). In the followingFROM
clause, the variablet
references the previously declared variablep
:Even if an identification variable is not used in the
WHERE
clause, its declaration can affect the results of the query. For an example, compare the next two queries. The following query returns all players, whether or not they belong to a team:In contrast, because the next query declares the
t
identification variable, it fetches all players that belong to a team:The following query returns the same results as the preceding query, but the
WHERE
clause makes it easier to read:An identification variable always designates a reference to a single value whose type is that of the expression used in the declaration. There are two kinds of declarations: range variable and collection member.
Range Variable Declarations
To declare an identification variable as an abstract schema type, you specify a range variable declaration. In other words, an identification variable can range over the abstract schema type of an entity. In the following example, an identification variable named
p
represents the abstract schema namedPlayer
:A range variable declaration can include the optional
AS
operator:In most cases, to obtain objects a query uses path expressions to navigate through the relationships. But for those objects that cannot be obtained by navigation, you can use a range variable declaration to designate a starting point (or root).
If the query compares multiple values of the same abstract schema type, then the
FROM
clause must declare multiple identification variables for the abstract schema:For a sample of such a query, see Comparison Operators.
Collection Member Declarations
In a one-to-many relationship, the multiple side consists of a collection of entities. An identification variable can represent a member of this collection. To access a collection member, the path expression in the variable's declaration navigates through the relationships in the abstract schema. (For more information on path expressions, see the following section.) Because a path expression can be based on another path expression, the navigation can traverse several relationships. See Traversing Multiple Relationships.
A collection member declaration must include the
IN
operator, but it can omit the optionalAS
operator.In the following example, the entity represented by the abstract schema named
Player
has a relationship field calledteams
. The identification variable calledt
represents a single member of theteams
collection.Joins
The
JOIN
operator is used to traverse over relationships between entities, and is functionally similar to theIN
operator.In the following example, the query joins over the relationship between customers and orders:
The
INNER
keyword is optional:These examples are equivalent to the following query, which uses the
IN
operator:You can also join a single-valued relationship.
A
LEFT JOIN
orLEFT OUTER JOIN
retrieves a set of entities where matching values in the join condition may be absent. TheOUTER
keyword is optional.A
FETCH JOIN
is a join operation that returns associated entities as a side-effect of running the query. In the following example, the query returns a set of departments, and as a side-effect, the associated employees of the departments, even though the employees were not explicitly retrieved by theSELECT
clause.Path Expressions
Path expressions are important constructs in the syntax of the query language, for several reasons. First, they define navigation paths through the relationships in the abstract schema. These path definitions affect both the scope and the results of a query. Second, they can appear in any of the main clauses of a query (
SELECT
,DELETE
,HAVING
,UPDATE
,WHERE
,FROM
,GROUP BY
,ORDER BY
). Finally, although much of the query language is a subset of SQL, path expressions are extensions not found in SQL.Examples
In the following query, the
WHERE
clause contains asingle_valued_path_expression
. Thep
is an identification variable, andsalary
is a persistent field ofPlayer
.The
WHERE
clause of the next example also contains asingle_valued_path_expression
. Thet
is an identification variable,league
is a single-valued relationship field, andsport
is a persistent field ofleague
.In the next query, the
WHERE
clause contains acollection_valued_path_expression
. Thep
is an identification variable, andteams
designates a collection-valued relationship field.Expression Types
The type of a path expression is the type of the object represented by the ending element, which can be one of the following:
For example, the type of the expression
p.salary
isdouble
because the terminating persistent field (salary
) is adouble
.In the expression
p.teams
, the terminating element is a collection-valued relationship field (teams
). This expression's type is a collection of the abstract schema type namedTeam
. BecauseTeam
is the abstract schema name for theTeam
entity, this type maps to the entity. For more information on the type mapping of abstract schemas, see the section Return Types.Navigation
A path expression enables the query to navigate to related entities. The terminating elements of an expression determine whether navigation is allowed. If an expression contains a single-valued relationship field, the navigation can continue to an object that is related to the field. However, an expression cannot navigate beyond a persistent field or a collection-valued relationship field. For example, the expression
p.teams.league.sport
is illegal, becauseteams
is a collection-valued relationship field. To reach thesport
field, theFROM
clause could define an identification variable namedt
for theteams
field:WHERE Clause
The
WHERE
clause specifies a conditional expression that limits the values returned by the query. The query returns all corresponding values in the data store for which the conditional expression isTRUE
. Although usually specified, theWHERE
clause is optional. If theWHERE
clause is omitted, then the query returns all values. The high-level syntax for theWHERE
clause follows:Literals
There are four kinds of literals: string, numeric, Boolean, and enum.
String Literals
A string literal is enclosed in single quotes:
If a string literal contains a single quote, you indicate the quote by using two single quotes:
Like a Java
String
, a string literal in the query language uses the Unicode character encoding.Numeric Literals
There are two types of numeric literals: exact and approximate.
An exact numeric literal is a numeric value without a decimal point, such as 65, -233, and +12. Using the Java integer syntax, exact numeric literals support numbers in the range of a Java
long
.An approximate numeric literal is a numeric value in scientific notation, such as 57., -85.7, and +2.1. Using the syntax of the Java floating-point literal, approximate numeric literals support numbers in the range of a Java
double
.Boolean Literals
A Boolean literal is either
TRUE
orFALSE
. These keywords are not case-sensitive.Enum Literals
The Java Persistence Query Language supports the use of enum literals using the Java enum literal syntax. The enum class name must be specified as fully qualified class name.
Input Parameters
An input parameter can be either a named parameter or a positional parameter.
A named input parameter is designated by a colon (
:
) followed by a string. For example,:name
.An positional input parameter is designated by a question mark (
?
) followed by an integer. For example, the first input parameter is?1
, the second is?2
, and so forth.The following rules apply to input parameters:
Conditional Expressions
A
WHERE
clause consists of a conditional expression, which is evaluated from left to right within a precedence level. You can change the order of evaluation by using parentheses.Operators and Their Precedence
Table 27-2 lists the query language operators in order of decreasing precedence.
BETWEEN Expressions
A
BETWEEN
expression determines whether an arithmetic expression falls within a range of values.These two expressions are equivalent:
The following two expressions are also equivalent:
If an arithmetic expression has a
NULL
value, then the value of theBETWEEN
expression is unknown.IN Expressions
An
IN
expression determines whether or not a string belongs to a set of string literals, or whether a number belongs to a set of number values.The path expression must have a string or numeric value. If the path expression has a
NULL
value, then the value of theIN
expression is unknown.In the following example, if the country is
UK
the expression isTRUE
. If the country isPeru
it isFALSE
.You may also use input parameters:
LIKE Expressions
A
LIKE
expression determines whether a wildcard pattern matches a string.The path expression must have a string or numeric value. If this value is
NULL
, then the value of theLIKE
expression is unknown. The pattern value is a string literal that can contain wildcard characters. The underscore (_) wildcard character represents any single character. The percent (%
) wildcard character represents zero or more characters. TheESCAPE
clause specifies an escape character for the wildcard characters in the pattern value. Table 27-3 shows some sampleLIKE
expressions.
NULL Comparison Expressions
A
NULL
comparison expression tests whether a single-valued path expression or an input parameter has aNULL
value. Usually, theNULL
comparison expression is used to test whether or not a single-valued relationship has been set.This query selects all teams where the league relationship is not set. Please note, the following query is not equivalent:
The comparison with
NULL
using the equals operator (=
) always returns an unknown value, even if the relationship is not set. The second query will always return an empty result.Empty Collection Comparison Expressions
The
IS [NOT] EMPTY
comparison expression tests whether a collection-valued path expression has no elements. In other words, it tests whether or not a collection-valued relationship has been set.If the collection-valued path expression is
NULL
, then the empty collection comparison expression has aNULL
value.Here is an example that finds all orders that do not have any line items:
Collection Member Expressions
The
[NOT] MEMBER [OF]
collection member expression determines whether a value is a member of a collection. The value and the collection members must have the same type.If either the collection-valued or single-valued path expression is unknown, then the collection member expression is unknown. If the collection-valued path expression designates an empty collection, then the collection member expression is
FALSE
.The
OF
keyword is optional.The following example tests whether a line item is part of an order:
Subqueries
Subqueries may be used in the
WHERE
orHAVING
clause of a query. Subqueries must be surrounded by parentheses.The following example find all customers who have placed more than 10 orders:
Exists Expressions
The
[NOT] EXISTS
expression is used in conjunction with a subquery, and is true only if the result of the subquery consists of one or more values and that is false otherwise.The following example finds all employees whose spouse is also an employee:
SELECT DISTINCT emp FROM Employee emp WHERE EXISTS ( SELECT spouseEmp FROM Employee spouseEmp WHERE spouseEmp = emp.spouse)All and Any Expressions
The
ALL
expression is used in conjunction with a subquery, and is true if all the values returned by the subquery are true, or if the subquery is empty.The
ANY
expression is used in conjunction with a subquery, and is true if some of the values returned by the subquery are true. AnANY
expression is false if the subquery result is empty, or if all the values returned are false. TheSOME
keyword is synonymous withANY
.The
ALL
andANY
expressions are used with the=
,<
,<=
,>
,>=
,<>
comparison operators.The following example finds all employees whose salary is higher than the salary of the managers in the employee's department:
SELECT emp FROM Employee emp WHERE emp.salary > ALL ( SELECT m.salary FROM Manager m WHERE m.department = emp.department)Functional Expressions
The query language includes several string and arithmetic functions which may be used in the
WHERE
orHAVING
clause of a query. The functions are listed in the following tables. In Table 27-4, thestart
andlength
arguments are of typeint
. They designate positions in theString
argument. The first position in a string is designated by 1. In Table 27-5, thenumber
argument can be either anint
, afloat
, or adouble
.
The
CONCAT
function concatenates two strings into one string.The
LENGTH
function returns the length of a string in characters as an integer.The
LOCATE
function returns the positon of a given string within a string. It returns the first position at which the string was found as an integer. The first argument is the string to be located. The second argument is the string to be searched. The optional third argument is an integer that represents the starting string position. By default,LOCATE
starts at the beginning of the string. The starting position of a string is1
. If the string cannot be located,LOCATE
returns0
.The
SUBSTRING
function returns a string that is a substring of the first argument based on the starting position and length.The
TRIM
function trims the specified character from the beginning and/or end of a string. If no character is specified,TRIM
removes spaces or blanks from the string. If the optionalLEADING
specification is used,TRIM
removes only the leading character(s) from the string. If the optionalTRAILING
specification is used,TRIM
removes only the trailing character(s) from the string. The default isBOTH
, which removes the leading and trailing character(s) from the string.The
LOWER
andUPPER
functions convert a string to lower or upper case, respectively.
Table 27-5 Arithmetic Expressions Function Syntax Return TypeABS(number)
int
,float
, ordouble
MOD(int, int)
int
SQRT(double)
double
SIZE(Collection) int
The
ABS
function takes a numeric expression and returns a number of the same type as the argument.The
MOD
function returns the remainder of the first argument divided by the second.The
SQRT
function returns the square root of a number.The
SIZE
function returns an integer of the number of elements in the given collection.NULL Values
If the target of a reference is not in the persistent store, then the target is
NULL
. For conditional expressions containingNULL
, the query language uses the semantics defined by SQL92. Briefly, these semantics are as follows:
- If a comparison or arithmetic operation has an unknown value, it yields a
NULL
value.- Two
NULL
values are not equal. Comparing twoNULL
values yields an unknown value.- The
IS NULL
test converts aNULL
persistent field or a single-valued relationship field toTRUE
. TheIS NOT NULL
test converts them toFALSE
.- Boolean operators and conditional tests use the three-valued logic defined by Table 27-6 and Table 27-7. (In these tables, T stands for
TRUE
, F forFALSE
, and U for unknown.)Equality Semantics
In the query language, only values of the same type can be compared. However, this rule has one exception: Exact and approximate numeric values can be compared. In such a comparison, the required type conversion adheres to the rules of Java numeric promotion.
The query language treats compared values as if they were Java types and not as if they represented types in the underlying data store. For example, if a persistent field could be either an integer or a
NULL
, then it must be designated as anInteger
object and not as anint
primitive. This designation is required because a Java object can beNULL
but a primitive cannot.Two strings are equal only if they contain the same sequence of characters. Trailing blanks are significant; for example, the strings
'abc'
and'abc '
are not equal.Two entities of the same abstract schema type are equal only if their primary keys have the same value. Table 27-8 shows the operator logic of a negation, and Table 27-9 shows the truth values of conditional tests.
Table 27-9 Conditional Test Conditional Test T F U ExpressionIS TRUE
T F F ExpressionIS FALSE
F T F Expression is unknown F F T
SELECT Clause
The
SELECT
clause defines the types of the objects or values returned by the query.Return Types
The return type of the
SELECT
clause is defined by the result types of the select expressions contained within it. If multiple expressions are used, the result of the query is anObject[]
, and the elements in the array correspond to the order of the expressions in theSELECT
clause, and in type to the result types of each expression.A
SELECT
clause cannot specify a collection-valued expression. For example, theSELECT
clausep.teams
is invalid becauseteams
is a collection. However, the clause in the following query is valid because thet
is a single element of theteams
collection:The following query is an example of a query with multiple expressions in the select clause:
It returns a list of
Object[]
elements where the first array element is a string denoting the customer name and the second array element is a string denoting the name of the customer's country.Aggregate Functions in the SELECT Clause
The result of a query may be the result of an aggregate function.
For select method queries with an aggregate function (
AVG
,COUNT
,MAX
,MIN
, orSUM
) in theSELECT
clause, the following rules apply:The following example returns the average order quantity:
The following example returns the total cost of the items ordered by Roxane Coss:
SELECT SUM(l.price) FROM Order o JOIN o.lineItems l JOIN o.customer c WHERE c.lastname = 'Coss' AND c.firstname = 'Roxane'The following example returns the total number of orders:
The following example returns the total number of items in Hal Incandenza's order that have prices:
SELECT COUNT(l.price) FROM Order o JOIN o.lineItems l JOIN o.customer c WHERE c.lastname = 'Incandenza' AND c.firstname = 'Hal'The DISTINCT Keyword
The
DISTINCT
keyword eliminates duplicate return values. If a query returns ajava.util.Collection
--which allows duplicates--then you must specify theDISTINCT
keyword to eliminate duplicates.Constructor Expressions
Constructor expressions allow you to return Java instances that store a query result element instead of an
Object[]
.The following query creates a
CustomerDetail
instance perCustomer
matching theWHERE
clause. ACustomerDetail
stores the customer name and customer's country name. So the query returns aList
ofCustomerDetail
instances:SELECT NEW com.xyz.CustomerDetail(c.name, c.country.name) FROM customer c WHERE c.lastname = 'Coss' AND c.firstname = 'Roxane'ORDER BY Clause
As its name suggests, the
ORDER BY
clause orders the values or objects returned by the query.If the
ORDER BY
clause contains multiple elements, the left-to-right sequence of the elements determines the high-to-low precedence.The
ASC
keyword specifies ascending order (the default), and theDESC
keyword indicates descending order.When using the
ORDER BY
clause, theSELECT
clause must return an orderable set of objects or values. You cannot order the values or objects for values or objects not returned by theSELECT
clause. For example, the following query is valid because theORDER BY
clause uses the objects returned by theSELECT
clause:SELECT o FROM Customer c JOIN c.orders o JOIN c.address a WHERE a.state = 'CA' ORDER BY o.quantity, o.totalcostThe following example is not valid because the
ORDER BY
clause uses a value not returned by theSELECT
clause:SELECT p.product_name FROM Order o, IN(o.lineItems) l JOIN o.customer c WHERE c.lastname = 'Faehmel' AND c.firstname = 'Robert' ORDER BY o.quantityThe GROUP BY Clause
The GROUP BY clause allows you to group values according to a set of properties.
The following query groups the customers by their country and returns the number of customers per country:
The HAVING Clause
The
HAVING
clause is used with theGROUP BY
clause to further restrict the returned result of a query.The following query groups orders by the status of their customer and returns the customer status plus the average
totalPrice
for all orders where the corresponding customers has the same status. In addition it considers only customer with status1
,2
, or3
, so orders of other customers are not taken into account: